AI
Generative Agents: Interactive Simulacra of Human Behavior
The authors use language models to create a believable simulation of a small town, in the style of The Sims
Three modules:
Memory stream: keeps a record of the agent's experiences; retrieval pulls relevant memories into the LLM's context.
Reflection: the LLM transforms observations from the memory stream into higher-level concepts (summaries). Triggered when the summed importance scores of the agent's latest observations exceed a threshold.
Planning: the LLM makes plans and is re-prompted with observations from the memory stream to update them.
ChatGPT used as the LLM of choice
Creepy AI vibes
Shows how far you can go with just clever prompt engineering
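A minimal sketch of how memory-stream retrieval could work, assuming memories are ranked by the sum of a recency, an importance and a relevance score. The Memory class, the retrieve function, the equal weighting, the 0.995 decay and the toy 2-D embeddings are illustrative assumptions, not the authors' code.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    created_hour: float      # simulation time when the memory was recorded
    importance: float        # 0..1; assumed to come from an LLM rating
    embedding: List[float]   # hypothetical embedding of the text


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(memories, query_embedding, now_hour, k=2, decay=0.995):
    """Return the k memories with the highest combined score."""
    def score(m):
        recency = decay ** (now_hour - m.created_hour)   # older memories fade
        relevance = cosine(m.embedding, query_embedding)
        return recency + m.importance + relevance        # equal weights (assumption)
    return sorted(memories, key=score, reverse=True)[:k]


memories = [
    Memory("ate breakfast", 0, 0.1, [1.0, 0.0]),
    Memory("planning a party", 8, 0.8, [0.0, 1.0]),
    Memory("saw a friend at the cafe", 9, 0.4, [0.6, 0.8]),
]
top = retrieve(memories, query_embedding=[0.0, 1.0], now_hour=10)
print([m.text for m in top])
```

The retrieved texts would then be pasted into the prompt, which is how old experiences get back into the LLM's limited context.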
ML
LLMs
Quite small but effective free LLM with image-text capabilities
Released weights
Aligns a frozen visual encoder from BLIP-2 with a frozen LLM, Vicuna (13B), using just one projection layer
Model can answer questions about images, read handwriting, and generate rap songs inspired by images.
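A toy sketch of the "one projection layer" idea, assuming the visual encoder emits a few feature vectors per image and the LLM consumes a sequence of embedding vectors. All names and the tiny dimensions (4 and 6) are made up for illustration; the real model sizes and training code are not shown here.

```python
import random


def make_projection(d_vis, d_llm, seed=0):
    """Create the only trainable parameters: a d_vis x d_llm weight and a bias."""
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, 0.02) for _ in range(d_llm)] for _ in range(d_vis)]
    bias = [0.0] * d_llm
    return weight, bias


def project(visual_tokens, weight, bias):
    """Map each visual token (length d_vis) to an LLM-sized vector (length d_llm)."""
    return [
        [sum(v * weight[i][j] for i, v in enumerate(token)) + bias[j]
         for j in range(len(bias))]
        for token in visual_tokens
    ]


def build_llm_input(visual_tokens, text_embeddings, weight, bias):
    """Prepend the projected image tokens to the text token embeddings."""
    return project(visual_tokens, weight, bias) + text_embeddings


# Stand-ins for the frozen parts: 3 visual tokens of size 4, 2 text tokens of size 6.
visual_tokens = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]
text_embeddings = [[0.0] * 6 for _ in range(2)]

weight, bias = make_projection(d_vis=4, d_llm=6)
sequence = build_llm_input(visual_tokens, text_embeddings, weight, bias)
print(len(sequence), len(sequence[0]))
```

Because both the encoder and the LLM stay frozen, only weight and bias would receive gradient updates, which is what keeps this kind of alignment cheap to train.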
Stability AI Launches the First of its StableLM Suite of Language Models
New open-source LLM, released in 3B and 7B parameter sizes
New dataset built on top of The Pile, but three times larger. Not released to the public yet.
Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM
Dolly 2.0: instruction-following ChatGPT-like model by Databricks
Fully open source; the license allows commercial use
Based on EleutherAI's Pythia, 12B parameters
The instruction dataset, databricks-dolly-15k, is also open source. It is larger than OpenAI’s InstructGPT dataset (13k).
DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
Quote: “A single script capable of taking a pre-trained Huggingface model, running it through all three steps of InstructGPT training using DeepSpeed-RLHF system and producing your very own ChatGPT like model”
Basically an easy-to-use implementation of the InstructGPT pipeline.
Heavily optimized engine for inference of ChatGPT-like models.
Free Copilot alternative
Seems to be Hugging Face integrated into AWS
Experimenting with LLMs to Research, Reflect, and Plan
The author builds ChatGPT-based bots that summarize URLs, write SQL queries, run searches, imitate famous people, and answer questions based on his own writing.
Used LangChain, Railway for hosting, Pinecone for context vector storage.
Notable papers
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
New work on text-to-video generation by Nvidia.
Most impressive result in this area so far.
Internet Explorer: Targeted Representation Learning on the Open Web
It is a way to pre-train a CLIP-like model for mapping text to images. The main gain is efficiency.
“Our approach, called Internet Explorer, explores the web in a self-supervised manner to progressively find relevant examples that improve performance on a desired target dataset”
The model samples text queries from a prior concept distribution (could be a GPT model), searches for images and downloads the top 100, updates the prior over concepts, and runs self-supervised learning on the downloaded images using a contrastive loss.
Model learns to make better queries over time.
The authors train a ResNet-50 in this way. It beats the usual CLIP ResNet-50 on the Birdsnap, Flowers, Pets and fMoW datasets while using only 2.5% as much compute and 0.5% as much data.
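A toy sketch of the "sample a query, score the results, update the query distribution" loop described above. It is not the paper's actual update rule: the explore function, the moving-average update and the score_results stub (which stands in for searching, downloading images and measuring how useful they are for the target dataset) are all assumptions made for illustration.

```python
import random


def explore(concepts, score_results, rounds=200, step=0.5, seed=0):
    """Return a query distribution that favours concepts with useful results."""
    rng = random.Random(seed)
    weights = {c: 1.0 for c in concepts}   # start from a uniform prior
    for _ in range(rounds):
        names = list(weights)
        concept = rng.choices(names, weights=[weights[n] for n in names])[0]
        # A real system would search the web for `concept`, download the top
        # images and train on them; here a stub returns a usefulness score.
        reward = score_results(concept)
        # Move the concept's weight toward its observed reward, with a small
        # floor so every concept can still be sampled occasionally.
        weights[concept] = max((1 - step) * weights[concept] + step * reward, 1e-3)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical usefulness of each concept for a bird-classification target.
usefulness = {"sparrow": 0.9, "heron": 0.8, "toaster": 0.05}
probs = explore(list(usefulness), lambda c: usefulness[c])
print(probs)
```

Over the rounds, probability mass drifts toward the bird-related concepts, which is the sense in which the queries get better over time.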
Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions
TL;DR: ChatGPT is bad at playing text games and at building a world model
Authors make ChatGPT play Zork. They probe the ability of the model to build a map of the game world.
DINOv2: Learning Robust Visual Features without Supervision
Basically DINO with more data, bigger scale, and optimizations.
SAM: zero-shot model to segment multiple objects in images.
Interactive: prompt it with a point on the entity, a bounding box, a mask or a text description
Image encoder is a ViT trained via MAE.
Trained on a dataset of 11M images and 1.1B masks, which is available for download.
SiLK -- Simple Learned Keypoints
One-stage model to find keypoints
Trained in a self-supervised manner
Thoughts, news and the rest
Google DeepMind: Bringing together two world-class AI teams
Building LLM applications for production
Behind the curtain: what it feels like to work in AI right now
ChatGPT is the iPhone moment for AI.
“AI Safety is a real problem that is entering the discourse as a public problem.”
Being an ML influencer is easy, but pushing stuff forward is hard.
“… working in AI feels like the candle that burns bright and short. I'm oscillating between the most motivated I've ever been and some of the closest to burnt-out I've ever felt.”
Slowing down development of AI systems passing the Turing test
Yoshua Bengio on why he signed the “Pause Giant AI Experiments” letter
“We succeeded in regulating nuclear weapons on a global scale after World War II, we can reach a similar agreement for AI.”
Today’s special — scams and dangers
AI clones teen girl’s voice in $1M kidnapping scam: ‘I’ve got your daughter’
Iran installs cameras in public places to identify, penalise unveiled women