Video Library

Videos

Highlights, deep dives, and short-form explainers across AI, data science, and policy.

Find current videos on TikTok, Instagram, or YouTube. I also keep a public Google Sheet with links and descriptions.

Deep Dive Videos

Train Mixture of Experts Model from Scratch - Simpsons Edition, YouTube (Nov 2025)

A Practical Guide to Evaluating Generative AI Applications - Updated, YouTube (Nov 2025) · Blog post

RAG Retrieval Deep Dive: BM25, Embeddings, and the Power of Agentic Search, YouTube (Oct 2025) · Blog post

Evaluation for Generative AI - A simply explained starting point, YouTube (May 2025)

Using Reasoning LLMs (Claude with Python or Agno), YouTube (Apr 2025)

Get Started with Deepseek's GRPO using QWEN and Hugging Face, YouTube (Feb 2025)

Unit Testing for Natural Language (LLMs) + LMUnit model, YouTube (Feb 2025)

Training Kolmogorov-Arnold Networks (KAN) using Pytorch and Nixtla on M3/M4 Time Series Datasets, YouTube (Nov 2024)

Feature Selection Methods for Machine Learning, plus Feature Selection Curves, YouTube (Oct 2024) · Blog post

Start using Llama 3.2 Vision Models with Hugging Face Transformers (on Snowflake), YouTube (Oct 2024)

Practical Lessons in Building Generative AI: RAG and Text to SQL, YouTube (Sep 2024) · Blog post

Spark of AI: How Transfer Learning Unlocked AI's Potential, YouTube (Sep 2024) · Blog post

Interpretable ML Models, YouTube (Aug 2024) · Blog post

Intro to Generative AI and Trends (March 2024), YouTube (Jun 2024)

Model Interpretability and Explainability for Machine Learning Models, YouTube (Jun 2024)

Large Language Models (LLMs) Can Explain Their Predictions, YouTube (Jan 2024)

Evaluation for Large Language Models and Generative AI - A Deep Dive, YouTube (Nov 2023) · Blog post

NanoGPT using Simpsons Data: Get Started with Large Language Models, YouTube (Sep 2023)

16 Challenges for LLMs - Paper Highlights, YouTube (Aug 2023)

Llama 2 Paper Explained, YouTube (Jul 2023)

GPT or BERT? Reviewing the tradeoffs of using Large Language Models versus smaller models, YouTube (Jun 2023)

Building Better Large Language Models - Key Concepts for Prompting and Fine Tuning, YouTube (Apr 2023)

Efficient Large Language Model training with LoRA and Hugging Face PEFT, YouTube (Mar 2023)

Text style transfer in a spreadsheet using Hugging Face Inference Endpoints, YouTube (Nov 2022) · Blog post

SetFit: Few Shot Learning for Text Classification, YouTube (Oct 2022) · Blog post

Prediction Intervals with Conformal Inference: An Intuitive Explanation, YouTube (Sep 2022) · Blog post

LayoutLMv3 Training with CORD (receipts dataset), YouTube (Sep 2022)

Fine Tuning an Image Classifier on Indian Food Images, YouTube (Aug 2022)

Explanation Approaches for Transformers, YouTube (Aug 2022) · Blog post

Short Form Videos

Sampling bias

2,682,156 views

2023-12-25

Sampling bias

#bias#sampling

Reinforcement learning with my Eat Melon! Demo based ...

565,300 views

2022-04-05

Reinforcement learning with my Eat Melon! Demo based on Karpathy #datascience #reinforcementlearning #techtok #machinelearning

#datascience#learning#machinelearning#reinforcement#reinforcementlearning#techtok

Google dropped Gemini. Let's talk about the different...

253,633 views

2023-12-06

Google dropped Gemini. Let's talk about the different sizes, tweaked benchmarks, multimodal, trained on TPUs, and how it's not that exciting. #g...

#different#dropped#gemini#google#gpt4#let#sizes#talk#tweaked

Sampling bias

235,783 views

2023-12-25

Sampling bias

#bias#enjoy#holidays#original#sampling#see#sketchplanations

Curse of dimensionality reminds us to think carefully...

166,921 views

2024-02-11

Curse of dimensionality reminds us to think carefully about feature selection. More isn’t always better. Use a feature selection curve. #d...

#curseofdimensionality#datascience#feature#featureselection#machinelearning#selection

There are lots of open-source code assistant tools. S...

90,205 views

2023-09-30

There are lots of open-source code assistant tools. Starcoder is the best known but many people are training and fine-tuning their own model...

#bigcode#copilot#github#languagemodels#machinelearning#models#open#refact#source#sqlcoder#starcoder#tabby

Toolformer from Meta shows the possibilities of using...

88,831 views

2023-02-13

Toolformer from Meta shows the possibilities of using APIs in an unsupervised way. #datascience #machinelearning #toolformer #largelanguagem...

#apis#datascience#largelanguagemodels#machinelearning#meta#model#shows#toolformer#tools

Transformer Explainer is an interactive visualization...

69,100 views

2024-08-11

Transformer Explainer is an interactive visualization tool to allow people to understand how transformers work through an end-to-end visuali...

#end#explainer#information#transformer#transformers#visualization

Text to SQL is now easier with a large language model...

68,107 views

2023-07-06

Text to SQL is now easier with a large language model released by Numbers Station called NSQL. #largelanguagemodels #nsql #numberstation #ma...

#introducing#largelanguagemodels#machinelearning#model#nsql#numbers#numberstation#sequel#sql#station#text

Med-Palm from Google for answering medical and clinic...

68,100 views

2022-12-29

Med-Palm from Google for answering medical and clinical knowledge. #datascience #machinelearning #largelanguagemodels #medpalm

#datascience#largelanguagemodels#machinelearning#medpalm#model#models

Control vectors are getting more widely supported mos...

65,529 views

2024-03-17

Control vectors are getting more widely supported most recently in Llama.cpp. It’s another useful technique alongside prompting fine tunin...

#control#engineering#getting#like#prompting#representation#vectors#vgel

Clustering with k-means. This skit was inspired by th...

64,800 views

2022-12-31

Clustering with k-means. This skit was inspired by the examples in Schubert paper on stop using the elbow criterion for kmeans. Any other cl...

#clustering#kmeans#like#look#means#one

Very excited and Richard isn’t paying me for this - #...

64,100 views

2022-05-06

Very excited and Richard isn’t paying me for this - #codetok #youdotcom #codingtiktok #python

#code#codetok#codingtiktok#like#python#youdotcom

Climax, a new transformer based model for predicting ...

61,400 views

2023-02-08

Climax, a new transformer based model for predicting weather and climate forecasting. Great example of the flexibility of transformers based...

#climatemodel#climax#datascience#machinelearning#models#transformers

NASA uses generative AI for manufacturing parts for s...

61,097 views

2023-10-28

NASA uses generative AI for manufacturing parts for space. It's a great use of generative technology and you can start seeing how it will ch...

#61#design#generative#generativeai#manufacturing#nasa

AI News for the week featuring OpenAI, NVIDIA, Google, A...

59,515 views

2023-10-21

AI News for the week featuring OpenAI, NVIDIA, Google, Apple, Stanford, Hugging Face, Anthropic, and Microsoft. #machinelearning #openai #rajistics...

#anthropic#featuring#google#machinelearning#news#nvidia#openai#stanford#week

Simply explaining how ChatGPT works. All the technica...

56,100 views

2022-12-07

Simply explaining how ChatGPT works. All the technical details of ChatGPT have not been released, so this is based on what OpenAI has been d...

#answers#chatgpt#datascience#machinelearning#openai#reinforcementlearning

Twitter open sourced its recommendation algorithm. I...

55,500 views

2023-04-01

Twitter open sourced its recommendation algorithm. It's fun to look at someone else's production code and will be useful to people studying...

#code#datascience#fun#machinelearning#recommenders#twitter

Tensorboard embedding projector - repost

52,400 views

2024-07-01

Tensorboard embedding projector - repost

#also#cold#concepts#look#see#tensorboard

#onthisday tensorboard embedding projector. Let me kn...

49,500 views

2023-07-01

#onthisday tensorboard embedding projector. Let me know if I should reshare these older videos.

#embedding#know#let#onthisday#projector#tensorboard

This happens. #datascience #machinelearning #python #...

48,900 views

2022-05-04

This happens. #datascience #machinelearning #python #codetok #programming

#codetok#datascience#happens#machinelearning#programming#python

Vector Databases (Pinecone)

48,861 views

2023-04-17

Vector Databases (Pinecone)

#chroma#databases#embeddings#faiss#milvus#pinecone#vector#weaviate

Some general advice on how to evaluate software packa...

47,700 views

2022-11-30

Some general advice on how to evaluate software packages. #datascience #machinelearning #github

#github#great#look#package#see#want

Point-E from #openai. Generating 3D point clouds from...

46,400 views

2022-12-20

Point-E from #openai. Generating 3D point clouds from text #datascience #machinelearning

#clouds#datascience#machinelearning#model#openai#point

GPT4 hype that it will be 100 trillion parameters doe...

45,500 views

2023-01-17

GPT4 hype that it will be 100 trillion parameters doesn’t make any sense. First, is the scaling laws in this video @rajistics and also thin...

#datascience#gpt#gpt4#hype#machinelearning#openai

Some common data distributions when modeling includin...

45,300 views

2023-02-03

Some common data distributions when modeling including skewed and zero inflated. There are many other distributions, but just wanted people ...

#common#data#datadistribution#datascience#statistics#zeroinflated

Open source LLMs, while they seem popular, are not easy t...

43,496 views

2023-06-03

Open source LLMs, while they seem popular, are not easy to get running in production settings. The current open source LLMs while getting better...

#anthropic#common#crawl#datascience#flant5#language#large#largelanguagemodels#like#machinelearning#models#open#openai#source#using

A simple explanation of what AI is. The video touches...

43,250 views

2023-06-25

A simple explanation of what AI is. The video touches upon the impact of AI, how AI works with a practical example, and some of the reasons AI...

#ai#aiexplained#datascience#explanation#let#machinelearning#simple#years

Stable diffusion, go run it yourself! It’s so awesome...

42,600 views

2022-08-25

Stable diffusion, go run it yourself! It’s so awesome. #datascience #codetok #aipub #huggingface #machinelearning

#aipub#codetok#datascience#diffusion#huggingface#something

Reply to @bird_3288 #python #rstat #datascience #anal...

41,600 views

2022-03-12

Reply to @bird_3288 #python #rstat #datascience #analytics #programming

#analytics#datascience#languages#programming#python#rstat

Working with embeddings today. #datascience #word2ve...

41,200 views

2022-07-01

Working with embeddings today. #datascience #word2vec #embeddings #tensorflow #codetok #tensorboard

#datascience#embeddings#see#tensorboard#tensorflow#word2vec

Data Centric AI helps to remind us not to focus too m...

40,900 views

2023-02-24

Data Centric AI helps to remind us not to focus too much on the model or algorithms. In real data science, it’s more about understanding you...

#cleanlab#data#datascience#machinelearning#model#much

It’s important to make sure your model is well calibr...

40,300 views

2022-11-11

It’s important to make sure your model is well calibrated. This becomes especially important with imbalanced data. #machinelearning #datasci...

#datascience#disease#look#machinelearning#model#statistics

The tech world has so many reactions to OpenAI firing...

39,745 views

2023-11-18

The tech world has so many reactions to OpenAI firing Sam Altman. Here are some very quick reactions. #openai #rajistics

#altman#back#firing#like#many#open#openai#reactions#sam#tech#world

🚀 Just get started on your journey to learn large ...

39,440 views

2023-07-26

🚀 Just get started on your journey to learn large language models! 🤔 Is there a lot to learn? Yes! 😅 🤷‍♂️ But is it easy t...

#colab#datascience#get#google#largelanguagemodels#llama#llama2#machinelearning#run#started

This paper introduces vec2vec, a method that aligns t...

39,208 views

2025-05-22

This paper introduces vec2vec, a method that aligns text embeddings from different language models—without access to the models or labeled d...

#aspects#different#embedding#embeddings#like#model#models#platonic#universal#vec2vec#vectavect#vector

Diffusion models for markup. #datascience #machinelea...

38,700 views

2022-10-13

Diffusion models for markup. #datascience #machinelearning #stablediffusion

#datascience#diffusion#machinelearning#markup#model#stablediffusion

So much going on around using generative tools for re...

38,609 views

2023-04-12

So much going on around using generative tools for reasoning with tasks. HuggingGPT or Jarvis is focused on helping on solving AI tasks. Aut...

#auto#autogpt#catchup#datascience#generative#gpt#gpt3#hugginggpt#jarvis#machinelearning#sims

Interpretable models offer a great alternative to tra...

38,528 views

2024-03-09

Interpretable models offer a great alternative to traditional machine learning algorithms. Generalized Additive Models like GA2M Rulefit and...

#algorithms#imodels#interpretable#learning#like#machine#models#offer#rules

Google’s sparrow is the rumored competitor to OpenAI ...

37,900 views

2023-01-22

Google’s sparrow is the rumored competitor to OpenAI ChatGPT. Check out the paper to see lots of examples of it chatting. It looks really go...

#chatgpt#datascience#google#machinelearning#openai#sparrow

A GPT-5 + Gemini Pro pipeline might sort one apple ev...

35,944 views

2025-10-01

A GPT-5 + Gemini Pro pipeline might sort one apple every 6 seconds. A simple light frequency sensor sorts hundreds per second. The outcome? ...

#color#even#every#fruit#gemini#gpt#light#might#per#pipeline#pro#second#seconds#sensor#showdown#sort#sorting

Some alternatives to clustering with k-means. This sk...

35,093 views

2023-12-31

Some alternatives to clustering with k-means. This skit was inspired by the examples in Schubert paper on stop using the elbow criterion for...

#clustering#datascience#kmeans#machinelearning#means#statistics

This video explains the findings from the Google Rese...

34,282 views

2025-07-26

This video explains the findings from the Google Research paper "Learning Without Training: The Implicit Dynamics of In-Context Learning" (a...

#context#deep#implicit#learning#model#models#need#paper#training#understanding#update#video

Always have a baseline model. For time series you can...

33,531 views

2023-10-09

Always have a baseline model. For time series you can often compare to what happened in a previous time step like last week. There are error...

#codetok#datascience#statistics#time#timeseries#timeseriesforcasting

No big deal, use visualization #stats #datascience #...

32,200 views

2022-01-14

No big deal, use visualization #stats #datascience #datasaurus #datascience #analytics #anscombe #visualization

#analytics#anscombe#datasaurus#datascience#stats#visualization

Customer lifetime value is a common data science use ...

31,700 views

2023-02-25

Customer lifetime value is a common data science use case. There are many ways to calculate this, but here I introduce the classic RFM method ...

#customer#customerlifetimevalue#datascience#machinelearning#marketing#rfm

Knowledge distillation is a useful technique to build...

31,648 views

2024-03-10

Knowledge distillation is a useful technique to build smaller high-performing models. DistilBERT is a great example of a widely used model t...

#arxiv#distilbert#distillation#example#knowledge#larger#model#org#pdf#smaller

In 2023, Meta intern Guangxuan Xiao discovered that r...

31,543 views

2025-08-09

In 2023, Meta intern Guangxuan Xiao discovered that removing the first few tokens in a sliding-window KV cache caused catastrophic degradati...

#attention#attentions#enabled#first#fix#llms#model#openai#sinks#sliding#streaming#tokens#xiao

Learning curves, it’s a technique I use all the time ...

31,500 views

2022-11-16

Learning curves, it’s a technique I use all the time when training models. Thanks to Todd C for showing me the best way to explain this. #da...

#bigdata#data#datascience#machinelearning#need#statistics

The Data Scientist title is worth $$$ €€€ £££ ¥¥¥. #d...

30,800 views

2022-02-22

The Data Scientist title is worth $$$ €€€ £££ ¥¥¥. #datascience #dataanalyst #analytics

#alert#analytics#bag#dataanalyst#datascience#major

This video explores Apple’s recent study on large rea...

30,563 views

2025-06-08

This video explores Apple’s recent study on large reasoning models and why they often fail to actually “reason.” It covers controlled puzzle...

#apple#claude#don#hands#harder#models#notebook#openai#reasoning#tasks#thinking

Reminder to visualize your data with one of my favori...

30,389 views

2023-10-30

Reminder to visualize your data with one of my favorites Anscombe's quartet #anscombesquartet #datavisualization #datascience #statistics #r...

#anscombe#anscombesquartet#data#datascience#datavisualization#example#quartet#reminder#statistics#using#visualize#visualizing

Vicuna is awesome, go check it out. It's the latest LL...

29,200 views

2023-03-31

Vicuna is awesome, go check it out. It's the latest LLama model and very impressive. I ended up cutting out the details on vicuna, since I fe...

#datascience#llama#machinelearning#open#openai#vicuna

Dealing with over plotting, another visualization tip...

28,800 views

2023-01-08

Dealing with over plotting, another visualization tip from data to viz #datascience #machinelearning #statistics #datavisualization

#data#datascience#datavisualization#machinelearning#statistics#use

7 Baseline Models: Time Series: Previous Value; Anomal...

28,700 views

2024-06-22

7 Baseline Models: Time Series: Previous Value; Anomaly: p99; Search: BM25; Recommendation: Popularity; Buy recommendations: last viewed; Classif...

#baseline#models#recommendation#search#use#value

I think langchain is awesome, but the future is an eas...

28,574 views

2023-03-17

I think langchain is awesome, but the future is an easy to use UI. Think Alteryx for LLMs. Langflow is a step in the right direction. #datasc...

#datascience#gpt4#langchain#langflow#largelanguagemodels#like#machinelearning#openai

The power of prompting! How to use a general purpose ...

28,550 views

2023-11-29

The power of prompting! How to use a general purpose model to be a special purpose fine tuned model. It's really important to learn good pro...

#gpt4#largelanguagemodels#medpalm#model#openai#promptengineering#prompting#purpose#want

Scaling laws help us figure out how to manage the amount...

28,500 views

2023-01-07

Scaling laws help us figure out how to manage the amount of training data versus the model size. DeepMind showed with Chinchilla by using more ...

#data#deepmind#laws#model#models#scaling

Visual question answering (VQA) is another cool task ...

28,500 views

2022-12-03

Visual question answering (VQA) is another cool task you can do with machine learning. #datascience #machinelearning #visualquestionanswerin...

#ask#datascience#image#machinelearning#question#visualquestionanswering

Segment Anything (SAM) is a new segmentation model fr...

27,500 views

2023-04-06

Segment Anything (SAM) is a new segmentation model from Meta. It's a huge improvement over the state of the art and is going to change compu...

#computervision#datascience#gpt3#imagesegmentation#llama#machinelearning#meta#openai#segmentanything#vicuna

DeepSeekv3 is turning heads - the paper is also reall...

27,301 views

2024-12-26

DeepSeekv3 is turning heads - the paper is also really good, check it all out at: https://github.com/deepseek-ai/DeepSeek-V3

#art#chinese#december#deepseek#deepseekv3#like#million#model#models#released#state#training

QLoRA - Efficient Finetuning of Quantized LLMs

26,714 views

2023-05-26

QLoRA - Efficient Finetuning of Quantized LLMs

#datascience#efficient#finetuning#largelanguagemodels#llms#lora#machinelearning#peft#qlora#quantized

YOLO is a seminal model in object detection for compu...

26,331 views

2023-09-13

YOLO is a seminal model in object detection for computer vision. But what is even more interesting is the principal author Joseph Redmon and...

#abs#arxiv#back#detection#like#model#object#org#paper#story#time#yolo

Pandas 2.0 combining with Arrow. A short recap on how i...

25,100 views

2023-03-01

Pandas 2.0 combining with Arrow. A short recap on how it fits in with polars, dplyr, and data.table. #datascience #machinelearning #rstats #py...

#datascience#datatable#dplyr#machinelearning#pandas#polars

RuterGPT is an inspirational story. Fine-tuning base ...

24,700 views

2024-05-02

RuterGPT is an inspirational story. Fine-tuning base models allows you to do so much, including better language support. Check out the story...

#fine#finetuning#largelanguagemodels#model#norwegian#vasamuseet

Customer lifetime value is a common data science use ...

24,113 views

2024-02-29

Customer lifetime value is a common data science use case. There are many ways to calculate this but here I introduce the classic RFM method...

#customer#customerlifetimevalue#datascience#machinelearning#marketinganalytics#rfm

YouChat and retrieval augmented models. To play aroun...

23,900 views

2022-12-26

YouChat and retrieval augmented models. To play around with this, check out haystack from deepset. #datascience #machinelearning #youchat #...

#chatgpt#datascience#machinelearning#openai#retrievalaugmentedmodel#youchat

Rust for machine learning. It’s useful in some cases ...

23,800 views

2022-09-25

Rust for machine learning. It’s useful in some cases for ML, but learn python first. #datascience #codetok #python #machinelearning #rust

#codetok#datascience#machinelearning#microsoft#python#rust

Showing the latent space for stable diffusion. Next v...

23,800 views

2022-09-14

Showing the latent space for stable diffusion. Next video is on explainability. #stablediffusion #datascience #machinelearning #codetok #uma...

#datascience#kind#machinelearning#see#stablediffusion#umap

Other tips I should share? #datascience #timeseries #...

23,700 views

2022-06-12

Other tips I should share? #datascience #timeseries #statistics #dataanalysis #python #codetok #mltok

#codetok#dataanalysis#datascience#python#statistics#timeseries

Post my favorites. If you want more search for Eat Me...

23,500 views

2025-01-16

Post my favorites. If you want more search for Eat Melon rajistics.

#eat#find#like#melon#poop#want#watermelon

Code LLama 70B dropped but we also have some other re...

23,236 views

2024-01-30

Code LLama 70B dropped but we also have some other research on building and using copilots that are also worthy. Code Llama: https://ai...

#also#building#code#coding#copilot#llama

Cheating has reared its head again over at Kaggle. So...

23,071 views

2023-01-30

Cheating has reared its head again over at Kaggle. Some background for folks on Kaggle and cheating there. #datascience #machinelearning #ka...

#chatgpt#cheating#datascience#enterprise#got#hey#kaggle#machinelearning#ottocompetition#reared

Composer will be sharing their new generative AI mode...

22,900 views

2023-02-26

Composer will be sharing their new generative AI models and they look amazing. The key is they decompose the image, which then provides a l...

#composer#datascience#generativeai#going#machinelearning#stablediffusion

Be skeptical of new models like TimesFM from Google (b...

22,632 views

2024-02-04

Be skeptical of new models like TimesFM from Google (but still listen). For many reasons deep learning models do not work well for time serie...

#deeplearningtechnique#foundation#google#model#series#time#timefm#timeseries#timesfm

An emerging trend of using large language models like...

22,531 views

2023-05-20

An emerging trend of using large language models like GPT-4 for labeling data instead of using humans to annotate data: #datascience #machin...

#alpaca#annotate#annotatingdata#data#datascience#gpt#gpt4#label#labelingdata#machinelearning#using

A little taste of geospatial analytics. Considering s...

22,508 views

2024-02-23

A little taste of geospatial analytics. Considering spatial information can be very valuable for data science and machine learning. It’s g...

#analytics#carto#data#geospatial#h3#hexes#spatial#uber

Challenging the common assumption about normal data d...

22,500 views

2025-02-04

Challenging the common assumption about normal data distributions, rajistics explains that real-world data often exhibits skewness, spikes a...

#assumption#challenging#common#data#distributions#normal#often#spikes

Missing data happens all the time. Don’t just jump to...

22,500 views

2022-11-24

Missing data happens all the time. Don’t just jump to dropping rows or using imputation techniques. #dataengineering #statistics #datascienc...

#data#dataengineering#datascience#imputation#missing#statistics

LayoutLMv3 Training with CORD (receipts) dataset

22,313 views

2022-09-09

LayoutLMv3 Training with CORD (receipts) dataset

#cord#dataset#layoutlmv3#receipts#training

Doing data analysis with large language models like C...

22,252 views

2023-04-25

Doing data analysis with large language models like ChatGPT. It's going to be amazing as these technologies let us combine our data text and...

#apr#bans#chatgpt#data#dataanalysis#datascience#machinelearning#news#openai#roundup#training

Why GPUs from NVIDIA are important for machine learning

22,141 views

2023-06-02

Why GPUs from NVIDIA are important for machine learning

#algebra#datascience#deeplearning#gpus#important#learning#machine#machinelearning#matrix#matrixmultiplication#nvidia

AI news roundup for the week #machinelearning #datasc...

21,860 views

2023-08-26

AI news roundup for the week #machinelearning #datascience #rajistics

#datascience#face#hugging#languagemodels#largelanguagemodels#llama#machinelearning#news#open#production#roundup#series#week

Let's dig into the detail for building your own large...

21,308 views

2023-06-04

Let's dig into the detail for building your own large language model on a custom domain. The LLaVA-Med does a great breakdown of how they bu...

#based#building#custom#datascience#domain#largelanguagemodel#llava#llm#machinelearning#model#pdf#vicuna

Singular Value Decomposition (SVD) Explained

20,750 views

2023-04-05

Singular Value Decomposition (SVD) Explained

#datascience#decomposition#explained#machinelearning#matrices#matrixalgebra#one#singular#singularvaluedecomposition#svd#value#working

Always have a baseline model. For time series, you ca...

20,700 views

2022-10-09

Always have a baseline model. For time series, you can often compare to what happened in a previous time step, like last week. There are err...

#baseline#codetok#datascience#model#statistics#timeseriesforcasting

Graph databases accelerate multi-hop traversals, but ...

20,600 views

2025-08-30

Graph databases accelerate multi-hop traversals, but most production queries are shallow (1–2 hops) that SQL or embeddings handle efficientl...

#accelerate#around#databases#graph#hop#like#multi#need#organizations#queries#sql

Breaking study from Harvard showing the impact of Lar...

20,400 views

2023-09-18

Breaking study from Harvard showing the impact of Large Language Models like GPT-4 on office productivity. #datascience #gpt4 #officeproduct...

#chatgpt#datascience#enjoy#gpt4#officeproductivity#older#papers#productivity#tik#tok#video

OpenAI's turmoil this last week will ensure enterpris...

20,339 views

2023-11-19

OpenAI's turmoil this last week will ensure enterprise AI strategies will not depend on OpenAI. It's clear for any valuable AI systems it's ...

#alternative#copilot#enterprise#get#last#open#openai#opensource#turmoil

#onthisday

20,300 views

2025-04-05

#onthisday

#get#onthisday#poop#reward#watermelon#way

This video is based on work Omar did in tracking down...

20,248 views

2023-06-27

This video is based on work Omar did in tracking down why Falcon was giving results that favored the Middle East. It's an example of how bia...

#biased#datascience#east#falcon#largelanguagemodels#llm#machinelearning#middle#modelbias#omar#towards

Improving your visualizations. Are you happy with you...

20,174 views

2025-01-08

Improving your visualizations. Are you happy with your viz?

#data#groups#hop#improving#like#llm#lot#multi#reasoning#retrieval#use#want

AI Engineer is starting to emerge as a new role. This...

19,998 views

2023-06-30

AI Engineer is starting to emerge as a new role. This role works with LLMs and does prompt engineering and fine tuning of models. They typic...

#aiengineer#datascience#emerging#engineer#llms#machinelearning#new#role#starting#tied

Open is thrown about a lot in the AI community. This ...

19,541 views

2024-02-02

Open is thrown about a lot in the AI community. This week Nomic and Allen AI remind us what it takes to build truly open-source AI models. T...

#language#models#nomic#open#pdf#training

Llama 2 is a worthy successor to Meta's original LLaM...

19,329 views

2023-07-18

Llama 2 is a worthy successor to Meta's original LLaMa model. It performs better -- on par with ChatGPT, has a commercial license, and is ...

#explained#llama#meta#model#paper#successor#worthy

Tesla self driving has been such a scam. I am so disa...

19,200 views

2023-01-17

Tesla self driving has been such a scam. I am so disappointed. I really believed that self driving could be pretty useful (I knew it wasn’t ...

#driving#fakedemo#fsd#self#tesla#wasn

A walkthrough of the explainer dashboard. It contains...

19,200 views

2022-11-29

A walkthrough of the explainer dashboard. It contains a lot of the tools you want when trying to explain your models. #datascience #machinel...

#datascience#machinelearning#permutationimportance#statistics#tools#want

The video demonstrates the limitations of LLMs by sho...

19,115 views

2025-02-16

The video demonstrates the limitations of LLMs by showcasing how various real-world AI problems are best solved using traditional machine le...

#demonstrates#group#language#learning#llms#math#models#optimization#policy#problems#relative#teaching#use#video

Everyone is using transformers! Are you working on op...

18,808 views

2023-11-21

Everyone is using transformers! Are you working on optimizing your use? The community has been steadily finding ways to optimize transformer...

#accelerating#community#machinelearning#optimizing#prague#pytorch#transformers#working

The Agony! #datascience #machinelearning #mltok #tec...

18,800 views

2022-04-06

The Agony! #datascience #machinelearning #mltok #techtok #statistics

#agony#datascience#machinelearning#mltok#statistics#techtok

Building a question / answer application using a larg...

18,793 views

2023-05-04

Building a question / answer application using a large language model is a great starter project. You will need to use a vector database and...

#answer#building#datascience#documents#generative#generativeai#largelanguagemodels#machinelearning#project#question#questionanswer#starter#vector#vectordatabase

Evaluation for Large Language Models (LLMs) and Gener...

18,612 views

2023-11-06

Evaluation for Large Language Models (LLMs) and Generative AI - A Deep Dive

#evaluation#generative#influence#language#large#largelanguagemodels#like#llms#logitbias#model#models#openai#temperature#use#words

One of my favorite methods for feature selection is r...

18,531 views

2024-02-18

One of my favorite methods for feature selection is recursive feature elimination. It's very easy to do and a starting data scientist can co...

#datarobot#datascience#feature#featureselection#selection#starting
Video

AMD chips training LLMs running Pytorch 2.0

18,404 views

2023-07-17

AMD chips training LLMs running Pytorch 2.0

#amd#chips#gpus#largelanguagemodels#llms#machinelearning#mosaicmscduet#pytorch#running#training

So what did I miss when you do error analysis? #machi...

18,400 views

2022-12-10

So what did I miss when you do error analysis? #machinelearning #datascience #statistics #erroranalysis

#air#datascience#erroranalysis#machinelearning#model#statistics

Prompt injection attacks are a major security concern...

18,370 views

2025-05-02

Prompt injection attacks are a major security concern when using large language models (LLMs) like ChatGPT. They allow attackers to overwrit...

#agno#attack#attacks#claude#injection#like#llms#major#prompt#python#reasoning#using

Struggling with data validation - let’s dig into how ...

18,312 views

2025-01-19

Struggling with data validation - let’s dig into how Pydantic's type system and validation framework elegantly solves these problems. The vi...

#add#check#data#fun#inputs#introductioin#look#pydantic#using#validating#validation

MuonClip used by Moonshot AI and developed by Keller ...

18,293 views

2025-07-15

MuonClip used by Moonshot AI and developed by Keller Jordan was used during the training of their trillion-parameter Kimi 2 model, addresses...

#kimi#large#llm#moonshot#muon#muonclip#optimizer#training#trillion#used

Curse of dimensionality reminds us to think carefully...

18,100 views

2023-02-11

Curse of dimensionality reminds us to think carefully about feature selection. More isn’t always better. Use a feature selection curve. #dat...

#curseofdimensionality#datascience#features#featureselection#machinelearning#model

Bias in Medical Imaging #datascience #codetok #algori...

18,100 views

2022-05-22

Bias in Medical Imaging #datascience #codetok #algorithmicbias #imaging #machinelearning #bias motivated by the comments from @rajistics

#algorithmicbias#bias#codetok#datascience#different#imaging

Histograms are a great visualization tool. Here are s...

18,000 views

2023-02-10

Histograms are a great visualization tool. Here are some caveats and tips for using histograms. #datascience #statistics #datavisualization ...

#bin#datascience#datavisualization#histogram#size#statistics
Video

Deepmind and OpenAI want everyone to focus on extreme...

17,873 views

2023-05-27

Deepmind and OpenAI want everyone to focus on extreme risks of AI. This helps them hype up AI and make themselves more attractive. The reali...

#datascience#deepmind#extreme#machinelearning#modelbias#modelrisk#models#openai#probably#risks
Video

The skit addresses the challenge of acquiring large v...

17,760 views

2023-12-20

The skit addresses the challenge of acquiring large volumes of labeled data for machine learning projects. The video focuses on using machin...

#data#datalabeling#evaluator#judge#labeling#learning#machine#mlflow#model#using

Replying to @philosophywithsuf explaining the irony f...

17,700 views

2022-12-15

Replying to @philosophywithsuf explaining the irony for pytorch building a graph and the history of tensorflow

#building#graph#irony#know#pytorch#tensorflow

Github copilot alternatives.

17,695 views

2024-09-30

Github copilot alternatives.

#agents#becoming#code#data#dsbench#est#far#github#gpt#look#mais#open#quatre#science#source#use
Video

The New York Times recently filed a lawsuit against O...

17,449 views

2024-01-01

The New York Times recently filed a lawsuit against OpenAI. This is another of many copyright lawsuits against AI companies. While everyone ...

#copyright#disneyplusvoices#many#mickeymouse#newyorktimes#openai

K-Means Flop? Here’s Why—and the Better Ways to Clust...

17,400 views

2025-01-05

K-Means Flop? Here’s Why—and the Better Ways to Cluster Your Data K-means can fall short when data scales, cluster shapes, or dimensionality...

#clusters#data#like#look#means#one

Reply to @milekumulator how I use GitHub #datascience...

17,400 views

2022-05-07

Reply to @milekumulator how I use GitHub #datascience #github #codetok #python #sportsanalytics

#codetok#datascience#github#projects#python#sportsanalytics

Classification outcomes and probabilities #datascienc...

17,400 views

2022-03-02

Classification outcomes and probabilities #datascience #machinelearning #algorithms

#algorithms#classification#datascience#machinelearning#malignant#probabilities
Video

The LocalLlama subreddit received a citation in a rec...

17,063 views

2023-06-28

The LocalLlama subreddit received a citation in a recent paper by Meta. Great reminder of the innovation you can get when models have a larg...

#datascience#innovation#largelanguagemodels#localllama#machinelearning#meta#openai#subreddit
Video

Customer lifetime value is a common data science use ...

17,031 views

2024-03-05

Customer lifetime value is a common data science use case. There are many ways to calculate this but here I show how a data scientist would ...

#customerlifetimevalue#data#datascience#machinelearning#marketinganalyticssummit#rfm
Video

Don't be afraid to challenge established models and a...

16,829 views

2023-12-30

Don't be afraid to challenge established models and assumptions! Often spending time with the data can give you new insights. One common lim...

#data#dataanalysis#datadistributions#models#new#often
Video

Python Optimal Transport is an open source Python lib...

16,808 views

2023-04-23

Python Optimal Transport is an open source Python library providing several solvers for optimization problems related to Optimal Transport f...

#bakeries#better#building#datascience#earthmoversdistance#key#language#large#machinelearning#models#optimization#pythonoptimaltransport#regularization#see#sinkhornknopp
Video

The hardest step is getting ComfyUI running on your c...

16,800 views

2024-02-29

The hardest step is getting ComfyUI running on your computer (you need a GPU). Go do it! Then you can create the coolest images using stable...

#ab_channel#comfyui#get#hardest#images#instant#let#stablediffusion#watch

Explaining how Emily Ocasio won second place with her...

16,700 views

2023-03-29

Explaining how Emily Ocasio won second place with her project analyzing media coverage. I like her approach, which highlights a growing trend o...

#data#datascience#emilyocasio#machinelearning#promptengineering#societyforscience
Video

LLMs have a lot of security issues. From prompt injec...

16,462 views

2024-01-18

LLMs have a lot of security issues: prompt injection attacks, extraction of training data, data poisoning, and even GPU-based attacks. How...

#data#datapoisoning#largelanguagemodels#promptinjection#security#training
Video

Time to get started with Generative AI. Look at how C...

16,413 views

2024-01-05

Time to get started with Generative AI. Look at how Cultrix got their model to the top of the OpenLLM Leaderboard. You can do this too! Cult...

#cultrix#fine#generative#mistral#model#tune
Video

Symbolic regression focuses on a mathematical represe...

16,362 views

2023-08-25

Symbolic regression focuses on a mathematical representation of your data. It's helpful in many situations where you need an explainable mod...

#data#datascience#eureqa#machinelearning#mathematical#regression#symbolic#symbolicregression
Video

Soundstorm is a new audio generation model from Googl...

16,247 views

2023-07-17

Soundstorm is a new audio generation model from Google. It can rapidly generate high-quality audio. Google isn't making this model available...

#audio#audioml#datascience#generation#google#machinelearning#model#soundstorm

Interpretable models are often overlooked, but a grea...

16,100 views

2022-11-05

Interpretable models are often overlooked, but a great addition to your data science toolkit. Imodels is a great python package for getting ...

#datascience#imodels#interpretable#interpretablemodels#machinelearning#statistics
Video

Replying to @Sam This video won't be popular but I ha...

16,000 views

2023-05-14

Replying to @Sam This video won't be popular but I have to speak the truth. Meta AI has been really sharing out top notch open source models...

#datascience#google#machinelearning#metaai#open#replying#research#sam
Video

Like beautiful plots of data maps? Check out DataMapP...

15,995 views

2024-01-09

Like beautiful plots of data maps? Check out DataMapPlot from Leland McInnes. To make the best use of this you will need to have your data c...

#bertopic#data#datamap#datamapplot#github#topicclustering
Video

XGBoost 2.0 is out with some great new features inclu...

15,934 views

2024-02-25

XGBoost 2.0 is out with some great new features including support for multi-target trees with vector-leaf outputs and learning to rank probl...

#classification#features#great#learning#models#multi#new#problems#video#xgboost

Quick introduction to optimization and for advanced f...

15,900 views

2022-12-25

Quick introduction to optimization and for advanced folks, go run a notebook from gurobi or do the Kaggle Santa challenge. #datascience #mac...

#datascience#gurobi#machinelearning#optimization#problem#travelingsalesmanproblem

I think langchain is awesome, but the future is an eas...

15,700 views

2023-03-17

I think langchain is awesome, but the future is an easy to use UI. Think Alteryx for LLMs. Langflow is a step in the right direction. #datasc...

#datascience#langchain#langflow#largelanguagemodels#machinelearning#use

New state of the art embedding model, Instructor, for...

15,600 views

2023-01-22

New state of the art embedding model, Instructor, for text is available! It accounts for task and domain when creating an embedding. #datascie...

#datascience#embeddings#huggingface#machinelearning#sentencetransformers#word2vec
Video

DINOv2 is a self-supervised machine learning model fo...

15,567 views

2023-12-27

DINOv2 is a self-supervised machine learning model for computer vision. It can be used for a variety of image tasks like image classificatio...

#dinov2#github#image#pdf#post#self

The best way to learn data science is working with...

15,500 views

2022-12-17

The best way to learn data science is working with data. You don’t need to spend money on courses or books. Spending time doing useful pr...

#data#datascience#learning#machinelearning#really#science

Cheating has reared its head again over at Kaggle. So...

15,200 views

2023-01-31

Cheating has reared its head again over at Kaggle. Some background for folks on Kaggle and cheating there. #datascience #machinelearning #ka...

#cheating#datascience#kaggle#machinelearning#medals#ottocompetition

I am haunted by this. Follow for more data science, s...

15,100 views

2024-12-25

I am haunted by this. Follow for more data science, stats, and AI/ML content.

#analysis#data#error#follow#haunted#learning#machine#science#stats#think
Video

Prompt injection attacks are a major security concern...

14,907 views

2023-05-03

Prompt injection attacks are a major security concern when using large language models (LLMs) like ChatGPT. They allow attackers to overwrit...

#attack#attacks#explained#injection#like#llms#major#prompt#security

This video breaks down why large language models can ...

14,716 views

2025-06-15

This video breaks down why large language models can produce different outputs even with the same prompt, seed, and temperature. The culprit...

#different#even#floating#gpus#math#nondeterminism#nonreproducible#point#solution#using
Video
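The description above is truncated, so the following is only a minimal sketch of one well-known ingredient that the entry's tags point to (floating point math), not a summary of the video. Floating-point addition is not associative, so summing the same numbers in a different order can give a slightly different result. The values below are arbitrary illustration values.

```python
# Floating-point addition is not associative: grouping changes the rounding.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # add a and b first, then c
right = a + (b + c)  # add b and c first, then a

# The two results differ in the last bits even though the math is "the same".
same = left == right
difference = abs(left - right)
print(same, difference)
```

When many such sums run in parallel, the order of accumulation may vary between runs, and tiny differences like this one can occasionally change which token ends up with the highest score.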

Fun way to talk about K-means algorithm #datascience ...

14,526 views

2022-08-22

Fun way to talk about K-means algorithm #datascience #codetok #analytics #machinelearning

#analytics#codetok#datascience#fun#machinelearning#openai#positive#rate#statistics#students#way

Replying to @Rajiv Shah long version of deep reinfor...

14,300 views

2022-07-15

Replying to @Rajiv Shah long version of deep reinforcement learning video from Week 4

#come#let#like#look#one#right

https://docs.google.com/presentation/d/1HEiuuOCni8Jao...

14,107 views

2025-06-17

https://docs.google.com/presentation/d/1HEiuuOCni8Jao1DNbjxE6VTl9Nxl-q6ObLxto8PdLxc/edit?slide=id.g32c5831c733_0_368#slide=id.g32c5831c733_0...

#agent#agents#demonstration#going#nyc#problems#research#slide#walking#woman#yolo
Video

To dig deeper go watch Sasha Rush's video on alternat...

13,905 views

2023-12-14

To dig deeper go watch Sasha Rush's video on alternatives to attention: https://youtu.be/dKJEpOtVgXc?si=Lx94-51PsjGF-YZT Dig deeper with the...

#alternatives#annotated#attention#deeper#dig#github#lot#mamba#models#really#space#state#transformers

Reply to @canutten1 Deep W with Atari Breakout #datas...

13,900 views

2022-04-18

Reply to @canutten1 Deep W with Atari Breakout #datascience #reinforcementlearning #techtok #machinelearning

#datascience#game#machinelearning#reinforcementlearning#techtok#way

Grok 4 proves that scaling still delivers — trained w...

13,801 views

2025-07-11

Grok 4 proves that scaling still delivers — trained with 100× more compute, it leads on Humanity’s Last Exam, ARC‑AGI, and tool-use benchmar...

#compute#groc#groc4#grok#last#model#scaling#still#use#works

Using LangChain with GPT3. I am seeing lots of cool d...

13,800 views

2023-01-14

Using LangChain with GPT3. I am seeing lots of cool demos based on LangChain and needed to make sure I covered it. It’s an easy way to take advan...

#datascience#gpt3#langchain#largelanguagemodels#machinelearning#using
Video

When analyzing improvements in AI always take a look ...

13,757 views

2023-11-05

When analyzing improvements in AI always take a look at the ablation studies. An important part is making sure the compute was held the same...

#ablation#ablations#compute#datascience#evaluating#important#improvements#machinelearning#studies#using
Video

Bias in Generative AI. This post is based on a blog p...

13,613 views

2023-05-19

Bias in Generative AI. This post is based on a blog post by text.io on bias in generative AI using an example of job postings. A great remin...

#bias#datascience#generative#generativeai#machinelearning#models#openai
Video

Curse of dimensionality reminds us to think carefully...

13,498 views

2024-02-11

Curse of dimensionality reminds us to think carefully about feature selection. More isn’t always better. Use a feature selection curve. #d...

#aiplanning#curseofdimensionality#datascience#feature#features#featureselection#largelanguagemodels#machinelearning#model#nlp#osu#planning#selection#travelplanner
Video

tiktok e182dd28dcee4103a056c2db6cbed6c7466e6f25

13,494 views

2025-09-06

#don#graphrag#need#probably

Visual Question/Answering with Document AI #datascien...

13,400 views

2022-09-22

Visual Question/Answering with Document AI #datascience #analytics #codetok #huggingface #documentai

#analytics#codetok#datascience#documentai#huggingface#take

DeepSeek hype was getting a bit much. Here is a quick...

13,388 views

2025-01-28

DeepSeek hype was getting a bit much. Here is a quick response.

#actually#bit#china#deepseek#easy#explainer#getting#hype#model#much#one#quick#work
Video

OpenAI announced their new deprecation policy and it'...

13,351 views

2023-07-07

OpenAI announced their new deprecation policy and it's going to affect people who are using OpenAI's models in production. They will have to...

#deprecation#going#largelanguagemodels#machinelearning#models#openai#opensource#policy

7 Baseline Predictive Models. Time Series: Previous V...

13,100 views

2025-06-28

7 Baseline Predictive Models. Time Series: Previous Value; Anomaly: p99; Search: BM25; Recommendation: Popularity; Buy recommendations: last vie...

#baseline#models#recommendation#search#use#value
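As a rough illustration of the first baseline in the list (time series: previous value), here is a minimal sketch. The function names and the sample series are hypothetical and are not taken from the video; the idea is simply to predict each point with the one before it and record the error that any fancier model has to beat.

```python
def previous_value_forecast(series):
    """Naive baseline: the forecast for step t is the observed value at t - 1."""
    return series[:-1]

def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual values and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical series, for illustration only.
series = [10, 12, 11, 15, 14]

predictions = previous_value_forecast(series)  # forecasts for series[1:]
baseline_mae = mean_absolute_error(series[1:], predictions)
print(baseline_mae)
```

A more complex forecasting model is only worth its cost if it scores clearly better than this number on the same data.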

Tensorflow fans are probably seething since they were...

13,100 views

2022-12-10

Tensorflow fans are probably seething since they were first and ignored. All good and will be easy for pytorch users to take advantage of m...

#fans#machinelearning#probably#pytorch#seething#tensorflow
Video

Reasoning and planning with LLMs are difficult as tri...

13,096 views

2024-02-07

Reasoning and planning with LLMs are difficult as trip planning shows. Hopefully now we have a benchmark teams will make progress on this re...

#aiplanning#largelanguagemodels#nlp#osu#planning#travelplanner
Video

Don't let people overlook open source software. It mi...

12,932 views

2024-01-19

Don't let people overlook open source software. It might be free but it's priceless. The Value of Open Source Software at https://papers.ssr...

#ainews#apple#build#datascience#don#explainability#google#machinelearning#meta#nvidia#open#openai#opensource#papers#software#source#synthetic#syntheticdata
Video

Replying to @Rajiv Shah | data science & AI Llama-2 d...

12,800 views

2023-07-22

Replying to @Rajiv Shah | data science & AI Llama-2 deep dive going through the paper by Meta. This is a 10-minute video but it still skips ...

#datascience#largelanguagemodels#llama2#machinelearning#paper#pdf

OpenAI released GPT-4o mini. Let's look at the perfor...

12,800 views

2024-07-19

OpenAI released GPT-4o mini. Let's look at the performance and cost of the model. We also assess how this affects competitors and the contin...

#data#going#model#models#smarter#training
Video

Solid forecasting advice and proved out in the M5 for...

12,705 views

2024-03-03

Solid forecasting advice and proved out in the M5 forecasting competition. Start with simple baselines and statistical approaches and then a...

#approaches#competition#competitors#forecasting#machinelearning#parrotaistyle#simple#start

ChatGPT price drop. Let’s break down how much the pri...

12,700 views

2023-03-02

ChatGPT price drop. Let’s break down how much the price dropped, how OpenAI could drop the price, the effects on performance, what is going ...

#anthropic#chatgpt#datascience#langchain#machinelearning#openai
Video

Diving into how Whisper v3 was trained. OpenAI used a...

12,468 views

2023-11-07

Diving into how Whisper v3 was trained. OpenAI used a combination of weak learning and pseudo-labeling. #whisper #openai #rajistics Whisper:...

#added#deterministic#github#inference#llm#openai#recognition#speech#weak#whisper
Video

Uncensored models are here. Eric Hartford has been bu...

12,376 views

2023-05-25

Uncensored models are here. Eric Hartford has been building the WizardLM series of models and sharing how he has been training the models. T...

#datascience#machinelearning#models#uncensored#uncensoredmodels#wizardlm
Video

LangChain added a new agent Plan and Execute. Looking...

12,375 views

2023-05-14

LangChain added a new agent Plan and Execute. Looking forward to the more advanced use cases people will build with it. This was inspired by...

#agent#agents#datascience#execute#langchain#largelanguagemodels#machinelearning#plan#solve

Saving you a trip to Twitter. #dataengineering #datab...

12,300 views

2022-11-24

Saving you a trip to Twitter. #dataengineering #databases There is one big vendor left out. Probably get sued for leaving them out.

#databases#dataengineering#one#saving#trip#twitter

Earlier videos: @rajistics @rajistics #deeplearning #...

12,300 views

2022-03-04

Earlier videos: @rajistics @rajistics #deeplearning #tensorflow #datascience #analytics

#analytics#datascience#deep#deeplearning#learning#tensorflow
Video

Using agents in langchain with gpt-3. You can do this...

12,291 views

2023-03-04

Using agents in langchain with gpt-3. You can do this! Go check it out. #datascience #machinelearning #openai #gpt3 #langchain

#datascience#datatable#dplyr#gpt3#langchain#machinelearning#openai#pandas#polars#using

Learn Regex, it will pay off #regex #datascience #pro...

12,200 views

2022-03-26

Learn Regex, it will pay off #regex #datascience #programming #analysis

#analysis#datascience#know#one#programming#regex
Video

A new LLM focused on data annotation and labeling bea...

12,056 views

2023-10-19

A new LLM focused on data annotation and labeling beats GPT4. It's built from Llama 13B and will be open source. #datascience #machinelearni...

#annotation#data#datalabeling#datascience#labeling#llm#llms#machinelearning#refuelai#semi#semisupervised#supervised
Video

Mixtral is a new model using a mixture of experts (Mo...

12,020 views

2023-12-09

Mixtral is a new model using a mixture of experts (MoE) approach. It consists of 8x7B Mistral models. It was pre-released on Friday; look for...

#experts#largelanguagemodels#mistral#mixtral#mixture#model#models

ChatGPT for Robotics is the latest hot paper. Large l...

12,000 views

2023-02-22

ChatGPT for Robotics is the latest hot paper. Large language models are the future interface. #datascience #machinelearning #largelanguagemo...

#chatgpt#datascience#largelanguagemodels#machinelearning#microsoft#robotics

OpenAI AI classifier is a great example to remind peo...

12,000 views

2023-02-04

OpenAI AI classifier is a great example to remind people of the limitations when detecting rare events. It’s not intuitive, so I showed the ...

#didn#openai#people#positive#rate#students
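The rare-event point can be made concrete with Bayes' rule. This is a minimal sketch with made-up numbers: the prevalence, sensitivity, and false positive rate below are hypothetical and are not the figures for OpenAI's classifier or the ones used in the video.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Share of flagged items that are truly positive (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Hypothetical setup: 2% of essays are AI-written, the detector catches 90%
# of them, and it wrongly flags 9% of human-written essays.
ppv = positive_predictive_value(
    prevalence=0.02, sensitivity=0.90, false_positive_rate=0.09
)
print(f"Chance a flagged essay is really AI-written: {ppv:.1%}")
```

Under these assumed numbers, most flagged essays are false alarms, because the human-written essays vastly outnumber the AI-written ones.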

Models that cheat, take shortcuts, and leak informati...

11,900 views

2023-01-03

Models that cheat, take shortcuts, and leak information are all part of the data scientist lifestyle. Every data scientist has a story li...

#data#datascience#machinelearning#model#models#scientist
Video

An experiment studying how well GPT4 can plan by usin...

11,845 views

2023-08-13

An experiment studying how well GPT4 can plan by using Block World and Mystery World. #largelanguagemodels #gpt4 #aiplanning #blockworld #my...

#ability#aiplanning#block#blockworld#gpt4#largelanguagemodels#llms#mysteryworld#planning#versus#world

Temperature is an important parameter when working wit...

11,800 views

2023-03-22

Temperature is an important parameter when working with many models including GPT-3. This video gives a short background on temperature and t...

#datascience#gpt3#largelanguagemodels#machinelearning#temperature#want
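For readers who want to see the mechanics, here is a minimal sketch of the usual way temperature is applied: the model's raw scores (logits) are divided by the temperature before a softmax turns them into probabilities. The logits below are invented for illustration, and this is a generic formulation rather than the internals of any particular API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]

low_temp = softmax_with_temperature(logits, 0.5)   # sharper distribution
high_temp = softmax_with_temperature(logits, 2.0)  # flatter distribution
print(low_temp, high_temp)
```

A low temperature concentrates probability on the top token, which gives more repeatable output; a high temperature spreads probability out, which gives more varied output.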

CLIP Interrogator is available over at the hugging fa...

11,800 views

2022-10-25

CLIP Interrogator is available over at the hugging face spaces. Have fun! #datascience #machinelearning #stablediffusion #huggingface

#datascience#fun#huggingface#machinelearning#stablediffusion#told
Video

What makes GPT-4 so special? One big part is the use ...

11,617 views

2023-07-08

What makes GPT-4 so special? One big part is the use of a Mixture of Experts approach. Let's start with how Galton used the wisdom of the cro...

#experts#gpt#largelanguagemodels#machinelearning#mixture#models#one#openai
Video

Code Interpreter is out and it's pretty amazing at fi...

11,500 views

2023-07-12

Code Interpreter is out and it's pretty amazing at first glance. However more experienced software developers and people concerned about dat...

#aiagents#answer#chatgpt#code#codeinterpreter#dataanalysis#interpreter#llama#meta#openai

Software licensing #github #codetok #gpl #programming...

11,500 views

2022-05-10

Software licensing #github #codetok #gpl #programming #python #creativecommons #copyright

#codetok#copyright#github#gpl#programming#want
Video

Japan said it was acceptable to use copyrighted mater...

11,495 views

2023-06-01

Japan said it was acceptable to use copyrighted material such as text and images to train AI. This has the approach of United States and oth...

#copyright#datascience#fairuse#israel#japan#learning#machine#machinelearning#models#training#updates#use
Video

NanoGPT using Simpsons Data: Get Started with Large L...

11,438 views

2023-08-27

NanoGPT using Simpsons Data: Get Started with Large Language Models

#data#datascience#dataset#get#github#largelanguagemodels#machinelearning#nanogpt#nanogpt_simpsons#simpsons#started#using#video

It pays to be organized. Find a friendly data enginee...

11,400 views

2022-10-06

It pays to be organized. Find a friendly data engineer if you need to. #datascience #analytics

#analytics#datascience#find#friendly#organized#pays
Video

Running large language models and transformer models ...

11,301 views

2023-09-05

Running large language models and transformer models locally in web browsers. Lots of tools for doing this, including mlc.ai, transformers.js...

#browswerai#language#large#largelanguagemodels#llms#locally#lot#mlcai#models#running#take#transformersjs#web#webgpu
Video

The best way to learn data science is working with...

11,254 views

2023-12-17

The best way to learn data science is working with data. You don’t need to spend money on courses or books. Spending time doing useful ...

#best#caps#data#datascience#different#future#jan#largelanguagemodels#learning#llms#machinelearning#many#market#meta#news#spending#way#yann

Yifan Zhao reverse-engineered Claude Code and uncover...

11,238 views

2025-08-31

Yifan Zhao reverse-engineered Claude Code and uncovered that its secret isn’t hidden logic, but a sophisticated stack of prompts. By interce...

#claude#code#cracking#engineered#learned#lessons#model#prompt#prompts#yifan#zhao

YOLO is a seminal model in object detection for compu...

11,200 views

2024-09-13

YOLO is a seminal model in object detection for computer vision. But what is even more interesting is the principal author, Joseph Redmon an...

#like#model#object#paper#time#yolo

This post was based on great stuff on Twitter, especi...

11,200 views

2022-12-01

This post was based on great stuff on Twitter, especially Ben’s Bites. I wanted to show the chat output, so wasn’t able to keep the original...

#chat#chatgpt#datascience#going#machinelearning#openai

My ranking of the top 26 algorithms for practical dat...

11,130 views

2024-11-30

My ranking of the top 26 algorithms for practical data science, breaking down their strengths, quirks, and when (or if) you should use them....

#algorithms#clustering#data#going#science#top#use

Learn about foundational models, especially in #nlp #...

11,100 views

2022-04-23

Learn about foundational models, especially in #nlp #naturallanguageprocessing #datascience #deeplearning #analytics #techtok #openai

#analytics#datascience#deeplearning#models#naturallanguageprocessing#nlp

Replying to @chokokrem Best machine learning tools fo...

11,000 views

2023-03-10

Replying to @chokokrem Best machine learning tools for competitions. Lots of great stuff here. #datascience #machinelearning #python #codeto...

#codetok#datascience#machinelearning#python#tools#use
Video

Dolly from Databricks is an open source fine tuned in...

10,928 views

2023-04-15

Dolly from Databricks is an open source fine tuned instruction large language model that can be used for commercial uses! Databricks has tak...

#animating#databricks#datascience#dolly#instructionfinetuning#largelanguagemodels#machinelearning#model#open
Video

Replying to XYZ A quick tutorial using WizMap to visu...

10,904 views

2023-07-03

Replying to XYZ A quick tutorial using WizMap to visualize embeddings. The process is extracting your embeddings using dimensionality reduct...

#dimensions#embedding#embeddings#going#lang#using#video#visualize#wizmap

Reminder to visualize your data with one of my favori...

10,900 views

2022-10-29

Reminder to visualize your data with one of my favorites #anscombesquartet #datavisualization #datascience #statistics

#anscombesquartet#data#datascience#datavisualization#statistics#visualize
Video

Reinforcement learning with my Eat Melon! Demo This d...

10,852 views

2023-08-30

Reinforcement learning with my Eat Melon! Demo This demo is based on Karpathy's work. Link: https://bit.ly/raj_eatmelon #datascience #reinfo...

#chatgpt#datascience#demo#eat#learning#machinelearning#melon#reinforcement#reinforcementlearning#rlhf#techtok

Why you should use group partitioning #datascience #m...

10,800 views

2022-06-02

Why you should use group partitioning #datascience #machinelearning #statistics #codetok #deeplearning #andrewng

#andrewng#codetok#datascience#deeplearning#machinelearning#statistics
Video

Histograms are a great visualization tool. Here are s...

10,759 views

2023-02-09

Histograms are a great visualization tool. Here are some caveats and tips for using histograms. #datascience #statistics #datavisualization ...

#acturialscience#autoinsurance#datascience#datavisualization#great#histogram#histograms#insurance#machinelearning#statistics#variables
Video

GPT-4 showing amazing results in causal reasoning. Fo...

10,742 views

2023-05-07

GPT-4 showing amazing results in causal reasoning. For practical purposes experiments are more useful than causal modeling. However this pap...

#amazing#causal#four#gpt#models#reasoning#results#showing#useful

Nat.dev playground is awesome. Should be a great remi...

10,700 views

2023-03-10

Nat.dev playground is awesome. Should be a great reminder of the diversity of large language models. #datascience #machinelearning #largelan...

#datascience#gpt3#largelanguagemodels#machinelearning#models#natdev

Reply to @declinedher being above average. I will add...

10,600 views

2022-02-10

Reply to @declinedher being above average. I will add citation in the comments. #statistics #regressiontothemean #aboveaverage

#aboveaverage#average#regressiontothemean#statistics#think#thought
Video

Feature engineering is an important part of the machi...

10,510 views

2024-02-24

Feature engineering is an important part of the machine learning lifecycle. It’s part art and skill. It takes time to learn and the best d...

#data#engineering#feature#model#part#playground#tensorflow

Replying to @anansaadi OpenAssistant is an open sourc...

10,500 views

2023-02-19

Replying to @anansaadi OpenAssistant is an open source project that aims to provide a chat based assistant that connects to other sources of...

#datascience#feedback#help#information#open#openassistant

How are you using similarity search? #nearestneighbor...

10,500 views

2022-06-26

How are you using similarity search? #nearestneighbor #annoy #spotify #datascience #statistics #codetok #python #similaritysearch

#annoy#data#datascience#nearestneighbor#spotify#statistics

The power of prompting! How to use a general purpose ...

10,453 views

2024-12-01

The power of prompting! How to use a general purpose model to be a special purpose fine tuned model. It’s really important to learn good pro...

#dspy#examples#going#gpt4#largelanguagemodels#model#programming#prompting#purpose#want
Video

Working with small datasets. Several tips including u...

10,401 views

2023-03-06

Working with small datasets. Several tips including using cross-validation, models like lasso, and running multiple iterations with differen...

#crossvalidation#data#datascience#elasticnet#lasso#like#machinelearning#randomseed

#greenscreenvideo Jealous. Go see how bad Meta bungle...

10,400 views

2023-01-28

#greenscreenvideo Jealous. Go see how bad Meta bungled their chatbot @rajistics

#bad#bungled#greenscreenvideo#jealous#meta#see

It’s almost here. Full support for pandas in sklearn ...

10,400 views

2022-10-18

It’s almost here. Full support for pandas in sklearn pipelines. #machinelearning #datascience #codetok #python #sklearn #sci-kit

#codetok#datascience#machinelearning#python#sci#sklearn
Video

OpenAI's new models look great and incorporate the la...

10,373 views

2024-01-28

OpenAI's new models look great and incorporate the latest advances. But don't forget about the open source as well as some tips for thinking...

#embedding#embeddings#leaderboard#models#mteb#new#open#openai#pdf
Video

Accuracy versus Interpretability/Explainability is a ...

10,297 views

2023-08-08

Accuracy versus Interpretability/Explainability is a typical tradeoff in machine learning. Depending on your use case you may favor one over...

#accuracy#dynamics#explainability#interpretability#measuring#pdf#riveter#social#tradeoff#versus
Video

Animated Drawings is a really fun model from Meta. It c...

10,276 views

2023-04-14

Animated Drawings is a really fun model from Meta. It can take a sketch drawing and then animate it. Great example of combining several image ...

#animateddrawings#datascience#fairanimateddrawings#imageclassification#machinelearning#meta

Updated! I am an idiot - This video explains how Mode...

10,202 views

2025-04-04

Updated! I am an idiot - This video explains how Model Context Protocol (MCP) allows language models like Claude to interact with external t...

#claude#context#explained#like#mcp#model#quickly#server#show#tool

Short summary of my longer video on efficiently trai...

10,200 views

2023-03-27

Short summary of my longer video on efficiently training a large language model using PEFT and LoRA. #datascience #machinelearning #largela...

#datascience#largelanguagemodels#lora#machinelearning#model#peft
Video

ChatGPT with the Code Interpreter can do a lot of com...

10,199 views

2023-09-29

ChatGPT with the Code Interpreter can do a lot of common data science tasks. We are going to see more tools help with routine data science t...

#chatgpt#codeinterpreter#data#datascience#education#machinelearning#science
Video

When you build a synthetic dataset you know where the...

10,148 views

2024-01-25

When you build a synthetic dataset you know where the noise is and where the signal is. This lets you better assess techniques for feature s...

#build#datascience#explainability#features#know#machinelearning#synthetic#syntheticdata
Video

Some tips for deploying large language models like Ll...

10,129 views

2023-07-30

Some tips for deploying large language models like Llama. Start by building some benchmarks for your tasks to assess how your model performs...

#deploying#deployment#hamel#largelanguagemodels#llm#model#quantization#tips#using
Video

Great week for AI! OpenAI dropped SORA for text to vi...

10,119 views

2024-02-17

Great week for AI! OpenAI dropped SORA for text to video, Google with Gemini 1.5 Pro with a longer context length, and Meta released V-JEPA wi...

#gemini#google#meta#openai#sora#videos
Video

To build generative AI models like the text-to-SQL sy...

10,104 views

2024-03-25

To build generative AI models like the text-to-SQL system by Snowflake it is important to create a realistic and challenging training datase...

#data#evaluation#generative#lessons#like#medium#model#part#sequel#snowflake#sql#text
Video

Claude 3 and lots of unbelievable claims. Let’s wal...

10,035 views

2024-03-07

Claude 3 and lots of unbelievable claims. Let’s walk through some of the more viral reactions and explain what is going on. We also need t...

#aihype#anthropic#claude#lots#models#status#test#trained#twitter
Video

Thinking about the size of numbers becomes important ...

10,005 views

2023-05-16

Thinking about the size of numbers becomes important when working with neural networks. This video touches on different techniques like u...

#bfloat16#datascience#floating#largelanguagemodels#machinelearning#models#quantization#slimming#techniques
Video

Picking a GPU for deep learning based on Tim Dettmers...

9,997 views

2023-01-16

Picking a GPU for deep learning based on Tim Dettmers' classic blog post. #datascience #machinelearning #deeplearning #gpu

#datascience#deep#deeplearning#first#gpt4#gpu#hype#machinelearning#openai#picking#scaling
Video

Direct Preference Optimization is one of the most sig...

9,995 views

2024-01-26

Direct Preference Optimization is one of the most significant advances in AI over the last six months. It provides a simpler and more effici...

#direct#dpo#going#like#model#optimization#preference#trl
Video

Deep dive on how to improve large language models. I ...

9,988 views

2023-04-28

Deep dive on how to improve large language models. I provide an introduction to zero-shot and few-shot learning methods. I also discuss the ...

#automating#datascience#expertise#largelanguagemodels#learning#machine#machinelearning#mlcopilot#peft#rlaif#rlhf#shot#tuning
Video

This is a year old but still holds up pretty well. Th...

9,977 views

2024-03-27

This is a year old but still holds up pretty well. The big difference is you may want to use TRL instead of PEFT for the training. But the c...

#datascience#flant5#largelanguagemodels#lora#machinelearning#model#peft

Reinforcement learning with my Eat Melon! Demo This ...

9,966 views

2024-04-23

Reinforcement learning with my Eat Melon! Demo This demo is based on Karpathy's work. Link: https://bit.ly/raj_eatmelon #datascience #reinf...

#chatgpt#datascience#machinelearning#reinforcementlearning#rlhf#techtok

Llama really upping it on training data. But this is ...

9,950 views

2024-04-26

Llama really upping it on training data. But this is a trend with scaling laws to use more and more training data. #trainingdata #largelang...

#data#enemy#largelanguagemodels#make#training#trainingdata

Parquet and Arrow file formats #datascience #analytic...

9,943 views

2022-05-31

Parquet and Arrow file formats #datascience #analytics #bigdata #codetok #dataengineer

#analytics#bigdata#codetok#dataengineer#datascience#file

How companies use your data for training models will be a...

9,883 views

2023-01-20

How companies use your data for training models will be a big issue this year. GitHub is being sued for Copilot and Hugging Face has been buildi...

#bigcode#code#copilot#data#github#huggingface
Video

AI works with various data types: tabular, unstructure...

9,874 views

2024-03-16

AI works with various data types: tabular, unstructured, and semi-structured like JSON. While tabular data is most prevalent in enterprises Ge...

#data#json#semi#semistructured#structured#tabular#unstructured

OpenAI plugins! Let's get everyone's APIs working with ...

9,862 views

2023-03-24

OpenAI plugins! Let's get everyone's APIs working with LLMs! This is a good thing. #largelanguagemodels #langchain #openai #datascience #machin...

#chatgpt#datascience#langchain#largelanguagemodels#like#openai

Reply to @sqwadiladida resources for learning about t...

9,762 views

2022-04-23

Reply to @sqwadiladida resources for learning about transformer models in #naturallanguageprocessing #datascience #techtok #statistics #anal...

#analytics#datascience#naturallanguageprocessing#next#statistics#techtok
Video

#onthisday showing the map of stable diffusion. #data...

9,730 views

2023-09-15

#onthisday showing the map of stable diffusion. #datascience #machinelearning #stablediffusion #rajistics

#chatgpt#datascience#gpt4#machinelearning#map#officeproductivity#onthisday#papers#productivity#showing#stablediffusion

Built on Nobel Prize–winning game theory, SHAP turned...

9,723 views

2025-10-18

Built on Nobel Prize–winning game theory, SHAP turned black-box models into glass boxes. Now with 24K+ GitHub stars, it’s the go-to explaina...

#explainability#game#learning#machine#shap#theory
Video

Claude Skills - A Short Explainer

9,661 views

2025-10-19

Claude Skills - A Short Explainer

#aiplanning#blockworld#claude#explainer#gpt4#largelanguagemodels#mysteryworld#short#skills
Video

Some great tips from Charlie over at Replicate on usi...

9,609 views

2023-08-15

Some great tips from Charlie over at Replicate on using Llama 2. A guide to prompting Llama 2 - https://replicate.com/blog/how-to-prompt-lla...

#charlie#chat#datascience#great#largelanguagemodels#llama#llama2#machinelearning#replicate#tips#using
Video

Do you calibrate your models? For many types of model...

9,495 views

2023-11-26

Do you calibrate your models? For many types of models you may need to calibrate them. This video reminds us of the importance of calibratio...

#calibrate#colorcalibration#datascience#disease#machinelearning#model#models#statistics

Using agents in langchain with gpt-3. You can do this...

9,416 views

2023-03-04

Using agents in langchain with gpt-3. You can do this! Go check it out. #datascience #machinelearning #openai #gpt3 #langchain

#answer#datascience#gpt3#langchain#machinelearning#openai

ColPali: Efficient Document Retrieval with Vision Lan...

9,380 views

2024-10-10

ColPali: Efficient Document Retrieval with Vision Language Models - https://arxiv.org/abs/2407.01449 - https://github.com/illuin-tech/colpal...

#colpali#document#language#patches#retrieval#vision
Video

Human in the loop is important but it's not a silver ...

9,335 views

2023-09-01

Human in the loop is important but it's not a silver bullet. #aiethics #tesla #cigna #rajistics Cigna: https://www.healthcaredive.com/news/c...

#aiethics#cigna#consequences#human#loop#news#tesla#using

My data science setup for now #datascience #codetok #...

9,319 views

2022-08-20

My data science setup for now #datascience #codetok #python #rstats #posit #vscode #googlecolab #digitalocean #conda

#codetok#datascience#google#python#rstats#use

Hugging Face #reinforcementlearning class #datascienc...

9,150 views

2022-04-26

Hugging Face #reinforcementlearning class #datascience #techtok #deeplearning #python

#datascience#deeplearning#hugging#python#reinforcementlearning#techtok
Video

The features or variables in auto insurance models. Le...

9,127 views

2024-02-13

The features or variables in auto insurance models. Learn from insurance: good features can give you a lot of predictive power. #datascience #...

#acturialscience#autoinsurance#datascience#feature#insurance#machinelearning#variables

Working with small datasets. Several tips including u...

9,102 views

2023-03-07

Working with small datasets. Several tips including using cross-validation, models like lasso, and running multiple iterations with differen...

#crossvalidation#data#elasticnet#going#lasso#like
Video

Humans are Biased: Generative AI is Even Worse

9,090 views

2023-07-27

Humans are Biased: Generative AI is Even Worse

#biased#biasml#datascience#even#generative#generativeai#humans#machinelearning#worse

Reviewing Anthropic's latest research and OpenAI conti...

9,040 views

2024-05-23

Reviewing Anthropic's latest research and OpenAI continuing to fumble. Mapping the Mind of a Large Language Model: https://www.anthropic.com/...

#anthropic#largelanguagemodels#like#mechanisticinterpretability#model#research

Introducing myself, like a year too late. Hope this f...

8,964 views

2022-11-28

Introducing myself, like a year too late. Hope this fills the gaps around this channel.

#channel#data#let#like#lot#science

I posted this on LinkedIn today, but wanted to share ...

8,896 views

2023-01-27

I posted this on LinkedIn today, but wanted to share here. GPT-3 is powerful, but sometimes domain specific models are going to do better. P...

#chatgpt#datascience#huggingface#machinelearning#model#openai

Why do transformers lock onto “meaningless” tokens li...

8,872 views

2025-10-24

Why do transformers lock onto “meaningless” tokens like [BOS] or punctuation? When one token’s activation spikes thousands of times higher, ...

#attention#compression#layers#llms#massive#sinks#spikes#transformers#two#valleys
Video

Prompt sensitivity is a thing. This video covers how ...

8,830 views

2024-01-11

Prompt sensitivity is a thing. This video covers how changes in formatting, the persuasion used in prompts, and prompt injection attacks are a...

#anthropic#largelanguagemodels#openai#prompt#prompting#promptinjection

This video explains how Singular Value Decomposition ...

8,712 views

2025-04-04

This video explains how Singular Value Decomposition (SVD) helps AI compress large amounts of data, similar to how a student condenses notes...

#decomposition#matrices#matrixes#singular#svd#using#value
Video

Three major improvements to the transformer architect...

8,702 views

2023-07-29

Three major improvements to the transformer architecture that everyone should know. They include Fast Attention, Rotary Positional Embeddings...

#architecture#attention#fast#flashattention#largelanguagemodels#machinelearning#mulitqueryattention#positional#positionalencodings#rotary#transformer
Video

Always have a baseline model. For time series, you ca...

8,644 views

2022-10-09

Always have a baseline model. For time series, you can often compare to what happened in a previous time step, like last week. There are err...

#codetok#datascience#statistics#time#timeseries#timeseriesforcasting
Video

Reinforcement Learning with AI Feedback (RLAIF) is an...

8,618 views

2023-09-07

Reinforcement Learning with AI Feedback (RLAIF) is an emerging approach to replace Reinforcement Learning with Human Feedback (RLHF). It wor...

#chatgpt#feedback#language#large#largelanguagemodels#learning#reinforcement#reinforcementlearning#rlaif#rlhf
Video

State-of-the-art results (100%!!) on widely used acad...

8,607 views

2023-09-25

State-of-the-art results (100%!!) on widely used academic benchmarks (MMLU, GSM8K, HumanEval, OpenbookQA, ARC Challenge, etc.). The model called ...

#ctnl#datascience#gives#largelanguagemodels#machinelearning#model#parody#pdf#phi#phictnl#state

Reply to @midnightlibrarian #datasaurus #stats #dinos...

8,570 views

2022-01-16

Reply to @midnightlibrarian #datasaurus #stats #dinosaur #analytics explaining #anscombe

#analytics#anscombe#data#datasaurus#dinosaur#stats

AI News Update: Meta - Major spending on AI and conti...

8,449 views

2024-01-20

AI News Update: Meta - Major spending on AI and continuing lawsuits over child safety NVIDIA - Stock at high, 1.5 Trillion market cap Google...

#ainews#applemusic#google#meta#nvidia#openai

The curse of dimensionality occurs when adding more f...

8,448 views

2025-02-11

The curse of dimensionality occurs when adding more features to a model leads to decreased performance due to increased noise and complexity...

#adding#curse#dimensionality#features#model#performance

Some of my favorite machine learning visualizations. ...

8,389 views

2024-05-05

Some of my favorite machine learning visualizations. Check them out to better understand how these algorithms work. If you work closely with...

#algorithms#machine#machinelearning#tools#understand#visualizations
Video

Automating machine learning with Large Language Model...

8,388 views

2023-05-01

Automating machine learning with Large Language Models (LLMs). While it's possible to ask ChatGPT to provide code for building a prediction ...

#automl#datascience#deplot#knowledge#language#largelanguagemodels#learning#llms#machinelearning#mlcopilot#model#one#reasoning#shot#visual

Videos with stable diffusion #datascience #machinelea...

8,327 views

2022-09-07

Videos with stable diffusion #datascience #machinelearning #stablediffusion #codetok

#datascience#going#images#machinelearning#stablediffusion#video
Video

NanoGPT is a simple fast repository for training/fine...

8,320 views

2023-08-20

NanoGPT is a simple fast repository for training/finetuning medium-sized GPTs. I recommend it for getting a deeper understanding of large la...

#announced#datascience#face#github#gpt4#hugging#huggingface#largelanguagemodels#machinelearning#meets#nanogpt#nanogpt_simpsons

Back! Time for AI on images. #datascience #computer...

8,284 views

2022-07-12

Back! Time for AI on images. #datascience #computervision #objectdetection #yolo #machinelearning #codetok

#codetok#computervision#datascience#machinelearning#objectdetection#yolo

Is RAG Actually Broken? A recent “semantic collapse” ...

8,280 views

2026-01-01

Is RAG Actually Broken? A recent “semantic collapse” claim argues that embeddings fail at scale because distances compress in high dimension...

#actually#embeddings#rag#real#systems#work

Try out these examples for yourself and lots more ava...

8,266 views

2023-01-31

Try out these examples for yourself and lots more available. It’s scary cool how these models are working. #datascience #machinelearning #gp...

#datascience#machinelearning#model#models#problem#solve
Video

Using scaling laws to help us get smaller models w...

8,252 views

2023-04-13

Using scaling laws to help us get smaller models with the same accuracy! Based on blog post by de Vries. #datascience #machinelearning #l...

#accurate#chinchilla#compute#datascience#largelanguagemodels#laws#machinelearning#model#models#scaling#scalinglaws#smaller#still#using
Video

Non-deterministic LLM inference is a deal. OpenAI has ...

8,182 views

2023-11-14

Non-deterministic LLM inference is a deal. OpenAI has started offering it hoping the rest of the providers will also offer it for enterprise ...

#determinism#largelanguagemodels#learning#model#models#non#nondeterministic#openai#results#training#using#weak#whisper

It’s happened! Time series #datascience #timeseries ...

8,143 views

2022-03-19

It’s happened! Time series #datascience #timeseries #analytics #statistics

#analytics#datascience#happened#statistics#time#timeseries

Meta’s Cicero for playing Diplomacy is impressive and...

8,137 views

2022-11-23

Meta’s Cicero for playing Diplomacy is impressive and a bit scary. #statistics #datascience #machinelearning #diplomacy

#datascience#diplomacy#machinelearning#play#statistics#trained

SKlearn Playground #datascience #machinelearning #sta...

8,137 views

2022-04-12

SKlearn Playground #datascience #machinelearning #statistics #techtok #sklearn

#datascience#decision#machinelearning#sklearn#statistics#techtok

Llama 3 the beginning of the end? Or will GPT5 up-end...

8,048 views

2024-04-21

Llama 3 the beginning of the end? Or will GPT5 up-end everything (they have had over a year)? A skit based on a thread by Carmen Gutierrez o...

#agi#going#llama3#meta#open#openai

I closely monitor technology trends in AI. Following ...

8,044 views

2024-05-04

I closely monitor technology trends in AI. Following huge developments at the end of 2022 and throughout 2023, the Generative AI space is no...

#apply#generative#like#plateau#problems#technology
Video

Language Models like ChatGPT can be modified by sever...

7,674 views

2023-04-08

Language Models like ChatGPT can be modified by several methods including Prompting, Instruction Fine-Tuning, and Reinforcement Learning with ...

#chatgpt#datascience#large#largelanguagemodels#like#machinelearning#many#model#modifying#openai#rlhf#train#ways

Applying reinforcement learning to teaching AI math. ...

7,666 views

2025-02-11

Applying reinforcement learning to teaching AI math. This is based off a notebook using Group Relative Policy Optimization (GRPO) on a QWEN ...

#deepseek#examples#get#grpo#math#model#notebook#qwen#reinforcement#reward#started#using
Video

Meta released Llama Guard for content moderation. It ...

7,660 views

2023-12-08

Meta released Llama Guard for content moderation. It looks to be effective and very adaptable. This is part of their Purple Llama project ar...

#content#contentmoderation#guard#largelanguagemodels#llama#llamaguard#meta#model
Video

Spaces gives you great interactive demos of many popu...

7,645 views

2023-04-24

Spaces gives you great interactive demos of many popular sklearn examples. It's a great place to browse and even contribute back by adding more...

#datascience#docs#documentation#examples#face#hugging#huggingface#interactive#machinelearning#sklearn#spaces
Video

Models and datasets have specific definitions. Models...

7,626 views

2023-04-29

Models and datasets have specific definitions. Models consist of at least two licenses nowadays this has been an issue for LLaMA where the c...

#content#copyright#datascience#dataset#datasets#defining#laoin#learning#llama#machine#machinelearning#models

The skit explains a dynamic pricing strategy that use...

7,617 views

2025-07-19

The skit explains a dynamic pricing strategy that uses machine learning to adjust prices based on what customers are willing to pay, rather ...

#advantages#data#deep#learning#mobile#pay#prices#pricing#understanding#willing
Video

Breaking News: Executive Order on AI. Quick video on t...

7,591 views

2023-11-01

Breaking News: Executive Order on AI. Quick video on the main issues; there is a lot more in the Order. It is over 100 pages. #executiveorde...

#breaking#executive#executiveorderai#means#meta#news#openai#order
Video

Based on:

7,567 views

2023-01-27

Based on:

#based

You know your transformer basics? Let's go over Enco...

7,563 views

2024-12-29

You know your transformer basics? Let's go over Encoder, Encoder-Decoder, and Decoder only models. If you want to dig deeper into the trans...

#decoder#embedding#encoder#encoders#fundamentals#models#text#transformer#use
Video

Context length has grown in importance for large lang...

7,549 views

2023-07-02

Context length has grown in importance for large language models. A longer context length lets you pass more information to the model effect...

#aiengineer#anthropic#context#contextlength#largelanguagemodels#length#lengths#llms#longchat#openai

Entropy can be a useful measure in machine learning. ...

7,477 views

2024-04-27

Entropy can be a useful measure in machine learning. Entropy and information gain are used in building decision trees. I have also seen entro...

#credit#entropy#know#liability#look#rating

Audio spectrogram transformer shows how widely we can...

7,438 views

2022-12-19

Audio spectrogram transformer shows how widely we can use #machinelearning #datascience #mlaudio #deeplearning

#datascience#deeplearning#machine#machinelearning#mlaudio#use

Prompt engineering helped optimize model behavior whe...

7,407 views

2025-07-04

Prompt engineering helped optimize model behavior when LLMs were less capable. But as models have improved, gains from prompt tweaks have di...

#context#engineering#explained#get#going#gonna#like#model#twenty

Let's talk about using H3 for geospatial analytics

7,394 views

2025-02-25

Let's talk about using H3 for geospatial analytics

#analytics#basketball#bringing#geospatial#hexes#let#moving#pose#spatial#sticks#talk#uber#use#using

Let's talk about why enterprises are considering alter...

7,393 views

2023-03-18

Let's talk about why enterprises are considering alternatives to ChatGPT by looking to open source. An open source strategy can affect lots o...

#api#chatgpt#data#know#open#openai

I have a lot more tea #datarobot #corporategreed #dat...

7,356 views

2022-06-22

I have a lot more tea #datarobot #corporategreed #datascience #codetok #techtok

#able#codetok#company#corporategreed#datarobot#datascience
Video

“Anscombe's quartet” and the “datasaurus dozen” ...

7,344 views

2024-01-14

“Anscombe's quartet” and the “datasaurus dozen” remind us of the importance of visualizing data.

#anscombe#datasaurus#dozen#importance#quartet#remind

Automatic Speech recognition in 3 lines of code using...

7,335 views

2022-11-17

Automatic Speech recognition in 3 lines of code using wav2vec2 in transformers #datascience #machinelearning #huggingface #automaticspeechre...

#asr#automatic#automaticspeechrecognition#datascience#huggingface#machinelearning
Video

If an AI story looks too good approach it critically....

7,306 views

2023-07-20

If an AI story looks too good, approach it critically. This week there are three examples with GPT-4, gzip, and Llama where AI influencers jump...

#datascience#gpt4#gzip#llama#machinelearning#taylorswift

Word as Image - great use of generative AI models lik...

7,294 views

2023-03-07

Word as Image - great use of generative AI models like stable diffusion to create fonts. Check out the paper at wordasimage.github.io #datas...

#datascience#fonts#generativeai#gonna#machinelearning#stablediffusion
Video

Retrieval Augmented approaches are a great way to imp...

7,270 views

2023-04-04

Retrieval Augmented approaches are a great way to improve your LLMs. Deepset, shown in this video, provides a set of tools but there are many ...

#augmented#chatgpt#datascience#deepset#language#large#largelanguagemodels#llamaindex#machinelearning#models#retrieval#retrievalaugmented#retrievalaugmentedmodel
Video

MiniGPT-4 brings us a multimodal model! It consists o...

7,241 views

2023-04-17

MiniGPT-4 brings us a multimodal model! It consists of a vision encoder with a pretrained ViT and an advanced Vicuna large language mode...

#datascience#gpt#gpt4#handling#images#machinelearning#minigpt#minigpt4#model#multimodal#text#vision
Video

Target leakage in the CrowdAI dataset. Target leakage...

7,231 views

2023-04-10

Target leakage in the CrowdAI dataset. Target leakage is a very common problem and everyone should understand it. I have seen even the smart...

#crowdai#data#dataleakage#datascience#leakage#machinelearning#sarcos#target#targetleakage
Video

Trying to talk about AGI in a reasonable manner. Ther...

7,197 views

2023-11-09

Trying to talk about AGI in a reasonable manner. There needs to be more hype and more rigor in talking about AGI. The Deepmind paper provide...

#agi#deepmind#google#levels#machinelearning#pdf#talk#trying

Open Source with Stable Diffusion - #datascience #cod...

7,176 views

2022-08-27

Open Source with Stable Diffusion - #datascience #codetok #machinelearning #stablediffusion #opensourcesoftware

#codetok#datascience#machinelearning#opensourcesoftware#software#stablediffusion

Mechanistic interpretability hands on! Try Monitor: h...

7,141 views

2024-12-04

Mechanistic interpretability hands on! Try Monitor: https://monitor.transluce.org/dashboard Monitor writeup: https://transluce.org/observabi...

#biblical#interpretability#mechanistic#model#monitor#predictions#steer#use#using#verses
Video

Statistics sounds heavy but a lot of concepts are ver...

7,136 views

2024-01-07

Statistics sounds heavy but a lot of concepts are very useful and can save you a lot of effort. This video is a reminder of the many ways we u...

#concepts#done#heavy#lot#random#sounds#statistics#teammates#teamwork#weaponizedincompetence#working#would
Video

Anomaly detection is hard. This is an introduction to...

7,125 views

2023-09-26

Anomaly detection is hard. This is an introduction to anomaly detection algorithms. The video focuses on the results for ADBench and what da...

#adbench#analytics#anomaly#anomalydetection#codetok#datascience#detection#results

Replying to @rajistics as promised, the features or va...

7,107 views

2023-02-12

Replying to @rajistics as promised, the features or variables in auto insurance models. Keep the feedback coming. #datascience #machinelearni...

#acturialscience#autoinsurance#datascience#insurance#machinelearning#variables
Video

DePlot translates plots into readable tables that an ...

7,104 views

2023-05-03

DePlot translates plots into readable tables that an LLM can query. It's based on the MatCha architecture with more fine-tuning on plots. Ni...

#datascience#deplot#documentai#machinelearning#matcha#visualreasoning
Video

ImageBind: the first AI model capable of binding data ...

7,095 views

2023-05-11

ImageBind: the first AI model capable of binding data from six modalities at once without the need for explicit supervision. It recognizes th...

#datascience#embeddings#imagebind#machinelearning#modalities#model#multimodal

Good reminder on what an open source model has now th...

7,094 views

2025-02-01

Good reminder on what an open source model has now that we are all talking about DeepSeek.

#data#good#model#models#open#reminder#source#talking#training
Video

Prediction Intervals with Conformal Inference: An Int...

7,090 views

2022-09-24

Prediction Intervals with Conformal Inference: An Intuitive Explanation

#conformal#conformalprediction#datascience#explanation#getting#inference#intervals#intuitive#prediction#predictioninterval#statistics
Video

Active learning uses an algorithm to help select what...

7,081 views

2023-05-18

Active learning uses an algorithm to help select what data to label. Ideally using this approach people can get comparable model results usi...

#active#activelearning#boundary#data#datalabeling#datascience#decision#labeling#learning#machine#machinelearning#using

Training whisper model

7,064 views

2024-11-07

Training whisper model

#arnold#data#hours#kan#kolmogorov#labeled#model#networks#open#shept#source#training#using#version#whisper

The politics of ChatGPT, it’s no different than any o...

7,028 views

2022-12-27

The politics of ChatGPT, it’s no different than any other technology and is not neutral. If you want a simple explanation of how ChatGPT wor...

#chatgpt#chatgtp#datascience#machinelearning#models#openai

Replying to @Data Storyteller Here are two examples ...

7,026 views

2022-07-22

Replying to @Data Storyteller Here are two examples of data or target leakage. I bet people have other fun examples. #datascience #targetle...

#dataleakage#datascience#machinelearning#model#target#targetleakage
Video

If you want more details on the biggest advancements ...

6,983 views

2023-12-22

If you want more details on the biggest advancements in AI for 2023 then find me on LinkedIn or Threads where I have a detailed post with al...

#2023#advancements#ai#also#biggest#data#details#like#models#one#practical#predictions#things#top#want

Feature engineering and data preprocessing are an imp...

6,980 views

2023-02-27

Feature engineering and data preprocessing are an important part of the machine learning process. #datascience #machinelearning #featureengi...

#datascience#engineering#feature#featureengineering#machinelearning#model
Video

Working with Categorical data using ordinal, one-hot (...

6,970 views

2023-12-01

Working with Categorical data using ordinal, one-hot (dummy), and target encoding. Do you have your own favorite approach? And ChatGPT tells m...

#analytics#categorical#data#datascience#featureengineering#statistics#working

It happens. Be careful. #aws #datascience #deeplearni...

6,940 views

2022-03-30

It happens. Be careful. #aws #datascience #deeplearning #gpu

#aws#careful#datascience#deeplearning#gpu#happens

Cursor’s new Tab-RL model uses reinforcement learning...

6,923 views

2025-09-13

Cursor’s new Tab-RL model uses reinforcement learning from real user feedback, rolling out checkpoints multiple times per day across 400M+ p...

#copilot#cursor#feedback#learning#model#new#reinforcement#tab#using

TabPFN revolution in data science. Please don’t waste your ...

6,898 views

2022-10-22

TabPFN revolution in data science. Please don’t waste your time on all this hype. Every week there is a revolution announced on Twitter. Ignore it...

#data#datascience#machinelearning#science#statistics#tabpfn

My creator hero just released a great new book and we...

6,858 views

2024-09-07

My creator hero just released a great new book and website. It's an excellent way to learn programming using JavaScript and build some very ...

#code#coding#daniel#like#nature#new

Parquet file format - Are you using it? For data scie...

6,857 views

2024-12-15

Parquet file format - Are you using it? For data science, data engineering, and machine learning it's a popular file format.

#data#engineering#everything#file#format#introducing#okay#parquet

Requested video - DSPy. DSPy brings a systematic appro...

6,848 views

2024-06-15

Requested video - DSPy. DSPy brings a systematic approach to prompting that gives you better-designed workflows while also optimizing prompts...

#approach#dspy#like#looks#prompting#using

Should you take the time to learn Kubernetes as a dat...

6,839 views

2023-01-23

Should you take the time to learn Kubernetes as a data scientist? Or are you already overloaded learning data science? #datascience #machinelear...

#data#datascience#don#kubernetes#learn#machinelearning
Video

Do you have a missing data story? Missing data happen...

6,829 views

2023-11-24

Do you have a missing data story? Missing data happens all the time. Should you just accept it? Drop rows? Use Imputation? or Keep digging? ...

#data#dataengineering#datascience#drop#imputation#missing#overcome#rows#skit#statistics
Video

Axolotl provides a declarative approach to fine tunin...

6,798 views

2024-01-22

Axolotl provides a declarative approach to fine tuning large language models. It's very easy to get started with and much easier for folks n...

#approach#axolotl#declarative#fine#large#largelanguagemodels#provides#tuning

Data visualization tips #datascience #dataviz #analyt...

6,784 views

2022-03-21

Data visualization tips #datascience #dataviz #analytics #datavisualization

#analytics#data#datascience#datavisualization#dataviz#let

Deepseek R1 - Latest open weights reasoning model, th...

6,756 views

2025-01-21

Deepseek R1 - Latest open weights reasoning model, the paper is very readable https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_...

#deepseek#hard#like#model#open#step#think

Deep dive video on using explanations that come out ...

6,747 views

2024-01-31

Deep dive video on using explanations that come out of large language models. This is something that is understudied but I find it quite us...

#deep#dive#explain#explanations#language#large#llms#models#predictions#using#video

Synthetic datasets have given me a way to better unde...

6,739 views

2023-01-25

Synthetic datasets have given me a way to better understand how to do feature selection and model explainability. Try it out sometime. #data...

#data#datascience#explainability#features#machinelearning#syntheticdata

This video has aged well. Models are very useful and ...

6,737 views

2024-05-21

This video has aged well. Models are very useful and widely used for labeling data and generating data.

#data#good#gpt#labeling#used#well

Replika and the growth of these character chatbots or...

6,722 views

2023-02-14

Replika and the growth of these character chatbots or socialbots is emerging as a big use case within generative AI. Here is a recent contro...

#chatbot#datascience#gpt3#people#replika#socialbot

My favorite ML visualizations. Will post on Reddit wi...

6,716 views

2025-05-05

My favorite ML visualizations. Will post on Reddit with links.

#algorithms#data#improve#learning#machine#tips#tools#understand#using#visualization#visualizations

Replying to @jbfjhcfv plotly is a great package for f...

6,714 views

2022-10-05

Replying to @jbfjhcfv plotly is a great package for folks using R or Python. It’s open source, so anyone can use it. #datascience #visualiza...

#analytics#datascience#plotly#python#rstats#visualization

Best practices for prompting are emerging. A couple of...

6,683 views

2023-04-30

Best practices for prompting are emerging. A couple of simple rules: start with an API-based LLM and focus on building good prompts. This...

#analysis#chatgpt#data#datascience#largelanguagemodels#llms#machinelearning#nlp#openai#preview#promptengineering#talk#ted

Using LangChain with GPT3. I am seeing lots of cool d...

6,679 views

2023-01-14

Using LangChain with GPT3. I am seeing lots of cool demos based on LangChain and needed to make sure I covered it. It’s an easy way to take adv...

#datascience#gpt3#know#langchain#largelanguagemodels#machinelearning#math#services#using#weather

Deciding whether to use a Large Language Model or a s...

6,656 views

2023-06-02

Deciding whether to use a Large Language Model or a smaller model? This video explores the tradeoffs between both approaches based on the la...

#bert#datascience#deciding#gpt#language#large#largelanguagemodels#like#machinelearning#model#models#reviewing#see#smaller#tradeoffs#using

Three new multimodal models this week but only one re...

6,653 views

2023-10-05

Three new multimodal models this week but only one respects data scientists. Once again it's Meta doing it right. #machinelearning #multimod...

#benchmarks#gpt#ision#machinelearning#meta#multimodal#openai#pdf#reka#rekaai

In this video, I cover how researchers from Alibaba u...

6,623 views

2025-06-30

In this video, I cover how researchers from Alibaba used supervised fine-tuning and reinforcement learning (GRPO) to improve workflow genera...

#beating#better#comfyui#fine#gpt#grpo#learning#model#reinforcement#tuning

Speculating on GPT-4 size and performance. #datascien...

6,598 views

2023-02-21

Speculating on GPT-4 size and performance. #datascience #machinelearning #gpt3 #gpt4 #openai see scaling law video: @rajistics

#datascience#gpt#gpt3#gpt4#machinelearning#openai

Quick intro to spaCy, which is a standard tool for pe...

6,543 views

2022-10-08

Quick intro to spaCy, which is a standard tool for people doing natural language processing #nlp or text analytics. Not my best video, buts ...

#analytics#codetok#datascience#nlp#python#spacey

Hugging Face announced a new valuation of $4.5 billio...

6,507 views

2023-08-24

Hugging Face announced a new valuation of $4.5 billion! #datascience #machinelearning #huggingface

#announced#considerations#datascience#deployment#face#hugging#huggingface#language#large#latency#llama#machinelearning#meta#model#models#paper#successor#worthy

Segment Anything (Meta's Segmentation Model)

6,499 views

2023-04-06

Segment Anything (Meta's Segmentation Model)

#anything#baseline#baselinemodel#benchmarkdataset#code#datascience#fun#look#lot#machinelearning#meta#model#practicaldatascience#recommender#segment#segmentation#twitter

Beam search is an alternative way for LLMs to generat...

6,429 views

2024-03-30

Beam search is an alternative way for LLMs to generate text. Let's walk through how beam search compares to greedy search. Alternatives incl...

#beam#beamsearch#different#generate#largelanguagemodels#like#search#textgeneration

Replying to @petererickson.art This was tough, a lot ...

6,417 views

2022-09-10

Replying to @petererickson.art This was tough, a lot of ground to cover. Let me know what I messed up on. I also have related videos on embe...

#information#latent#let#numbers#see#space

4 ways to do Dimensionality Reduction - PCA, Autoenco...

6,394 views

2024-11-10

4 ways to do Dimensionality Reduction - PCA, Autoencoders, t-SNE, and UMAP. Lots of reasons to do dimensionality reduction - you want to comp...

#autoencoder#cho#data#dimensionality#lot#nhi#pca#quan#reduction#techniques#tsne#tuy#umap#want

What kind are you? #datascience #statistics #python ...

6,376 views

2022-06-24

What kind are you? #datascience #statistics #python #codetok #mltok #practicaldatascience

#codetok#datascience#mltok#practicaldatascience#python#statistics

Amazon shared a new dataset with human-written long-f...

6,336 views

2025-01-13

Amazon shared a new dataset with human-written long-form answers across 7 domains for assessing LLM performance in retrieval-augmented QA. I...

#answers#arena#dataset#different#evaluate#evaluating#human#rag#using#written

Examining the data used for training our LLMs. Op...

6,307 views

2023-04-20

Examining the data used for training our LLMs. OpenAI is running into trouble in Europe since it won't disclose exactly what was used fo...

#best#data#don#engineer#flashback#know#let#like#llama#onthisday#openai#post#practices#reddit#redpajama#satire#together

Roundup of this week's news, let me know if you all li...

6,295 views

2023-02-10

Roundup of this week's news, let me know if you all like this format. I had a lot of fun making this. #datascience #machinelearning #dumbtech...

#datascience#dumbtechnews#google#machinelearning#microsoft#openai

Random forests and their ease of use are important in...

6,244 views

2023-02-18

Random forests and their ease of use are important in understanding modern data science. #datascience #machinelearning #statistics #randomfo...

#data#datascience#fortran#machinelearning#randomforest#statistics

What if you could build a research assistant that bea...

6,229 views

2025-11-21

What if you could build a research assistant that beat top models for $0.01 a query? Meet DR Tulu-8B, powered by a training method called RL...

#assistant#aware#beat#build#deep#model#models#open#quantization#research#rubrics#top#training

Reply to @bosstoastmaker tensorflow playground data ...

6,213 views

2022-02-05

Reply to @bosstoastmaker tensorflow playground data > models #featureengineering #datascience #tensorflow #deeplearning #analytics #ai

#analytics#data#datascience#deeplearning#featureengineering#tensorflow

Getting the best distance metric is crucial for solvi...

6,179 views

2023-10-15

Getting the best distance metric is crucial for solving analytical problems. This video reviews Euclidean, Manhattan, Mahalanobis, Levenshtein ...

#best#data#datascience#distance#distancemetrics#euclidean#getting#machinelearning#manhattan#metrics#science

Repost but scaling laws are still very important. Sca...

6,173 views

2024-01-09

Repost but scaling laws are still very important. Scaling laws help us figure out how to manage the amount of training data versus the model...

#concepts#datascience#deepmind#heavy#largelanguagemodels#lot#machinelearning#microsoft#nvidia#openai#random#sounds#statistics

How LLMs memorize information! Check out the Starcode...

6,162 views

2023-10-17

How LLMs memorize information! Check out the Starcoder Memorization space by Mithril Security and the notebook so you can look for LLM memor...

#data#information#language#large#largelanguagemodels#llms#memorization#memorize#mithril#mithrilsecurity#models#notebook#security#starcoder#training

My second try to explain in context learning or few s...

6,160 views

2023-01-28

My second try to explain in context learning or few shot learning with large language models. It’s very cool and why these models are so ex...

#datascience#gpt3#learning#machinelearning#model#models

Making efficient use of GPU Memory when training tran...

6,149 views

2023-06-07

Making efficient use of GPU Memory when training transformer models. This video covers the Kernel Overhead Optimizer states Activation memor...

#datascience#deeplearning#going#gpu#huggingface#machinelearning#memory#nvidia#transformers#use

Meta’s less than open source model and some bad takes...

6,125 views

2023-03-05

Meta’s less than open source model and some bad takes from Twitter. #datascience #machinelearning #largelanguagemodels #opensource #meta

#datascience#largelanguagemodels#machinelearning#meta#models#open

The current FTC leader Khan is willing to confront la...

6,124 views

2023-07-14

The current FTC leader Khan is willing to confront large tech companies about uncompetitive practices. There is a long history of abuses by ...

#businesses#companies#current#ftc#khan#leader#tech#technologyregulation#would

This will be fun! #python #codetok #datascience #prog...

6,046 views

2022-05-02

This will be fun! #python #codetok #datascience #programming

#codetok#datascience#fun#good#programming#python

AI News (Oct 29th 2023) with a focus on AI for persua...

6,009 views

2023-10-29

AI News (Oct 29th 2023) with a focus on AI for persuasion. #openai #meta #google #rajistics OpenAI on super persuasion: https://twitter.com...

#actors#emotions#google#lawsuits#meta#open#openai#persuasion

Why Feature Engineering is an important skill for d...

6,006 views

2024-02-24

Why Feature Engineering is an important skill for data science and machine learning

#data#engineering#feature#history#important#learned#make#older#science#since#skill#video

Google announced Bard, but we still don’t know much. ...

6,006 views

2023-02-06

Google announced Bard, but we still don’t know much. It is based on LaMDA, which has been around for a while. This is a safe bet, not ...

#announced#chatgpt#datascience#google#largelanguagemodels#machinelearning

Zero-shot object detection. #datascience #codetok #hu...

5,968 views

2022-08-09

Zero-shot object detection. #datascience #codetok #huggingface #objectdetection #deeplearning #zeroshotclassification

#codetok#datascience#deeplearning#huggingface#images#objectdetection

Never underestimate the power of the status quo #data...

5,961 views

2022-06-21

Never underestimate the power of the status quo #datascience #forecasting #statistics #SAS #python #codetok

#codetok#datascience#forecasting#python#sas#statistics

Visualizing decision trees with dtreeviz. #datascienc...

5,948 views

2022-12-28

Visualizing decision trees with dtreeviz. #datascience #machinelearning check out their GitHub and it’s pip install dtreeviz

#datascience#decision#dtreeviz#machinelearning#see#tree

Waterfall charts that show your progress as well as e...

5,930 views

2023-11-14

Waterfall charts that show your progress as well as explaining the error! This is what I like to see when I see a visualization of model res...

#charts#datascience#datavisualizations#like#machinelearning#model#modelreview#openai#results#see#use#visualizing#waterfall

Sports! #datascience #analytics #codetok #machinelear...

5,924 views

2022-08-16

Sports! #datascience #analytics #codetok #machinelearning #rstats #footballanalytics #statistics

#analytics#codetok#data#datascience#machinelearning#rstats

Do it! Get a server in the cloud. Build your skills....

5,900 views

2022-03-27

Do it! Get a server in the cloud. Build your skills. #datascience #programming #analytics #digitalocean

#analytics#datascience#digitalocean#get#programming#server

Evaluation of Large Language Models is a critical top...

5,875 views

2023-08-28

Evaluation of Large Language Models is a critical topic. Leaderboards provide a little guidance for evaluation but have many flaws. I am very ...

#evaluatingllms#evaluation#issues#large#largelanguagemodels#leaderboards#llm#models#people#semianalysis#topic

Week 1. #reinforcementlearning #huggingface #datascie...

5,860 views

2022-05-12

Week 1. #reinforcementlearning #huggingface #datascience #python #codetok #programming

#codetok#datascience#huggingface#programming#python#reinforcementlearning

Corporate research labs have changed academic work wi...

5,831 views

2023-01-28

Corporate research labs have changed academic work with their reluctance to provide reproducible research and getting around blind peer revi...

#datascience#hey#machinelearning#neurips#reproducibility#review

Galactica by Meta. Cool model, poor form on sharing i...

5,767 views

2022-11-17

Galactica by Meta. Cool model, poor form on sharing it out. #datascience #machinelearning I feel for students, it was going to write a lot ...

#datascience#galactica#machinelearning#meta#model#stuff

Reinforcement learning with my Eat Melon! Demo This ...

5,747 views

2024-08-30

Reinforcement learning with my Eat Melon! Demo This demo is based on Karpathy's work. Link: https://bit.ly/raj_eatmelon #datascience #reinf...

#chatgpt#datascience#machinelearning#reinforcementlearning#rlhfs#techtok

Starting to see people productionizing GPT-3 workflow...

5,728 views

2023-03-11

Starting to see people productionizing GPT-3 workflows. I am a big fan of using large language models. Here is how one data scientist dealt wi...

#data#datascience#gpt3#largelanguagemodels#machinelearning#using

Async is the difference between waiting… and working....

5,725 views

2025-11-22

Async is the difference between waiting… and working. It lets multiple I/O-bound tasks run concurrently on a single thread. When one request...

#async#cookies#difference#dramatically#inside#lets#network#neural#one#quiver#run#seeing#time#tool#waiting#working

Trying to conserve tokens? Here are two approaches ma...

5,714 views

2025-11-08

Trying to conserve tokens? Here are two approaches making waves right now. TOON cuts down on repeated syntax in structured data by replacing...

#compressing#deepseek#drop#isn#making#ocr#percent#text#token#tokens#toon

Don’t do analysis for the sake of analysis. Your anal...

5,705 views

2022-02-22

Don’t do analysis for the sake of analysis. Your analysis should be synced with a business objective. #datascience #analysis #dataanalyst

#analysis#dataanalyst#datascience#don#sake#synced

AI only knows what it's trained on. So beat it by d...

5,704 views

2023-02-21

AI only knows what it's trained on. So beat it by doing something new. The video shows recent examples of marines beating a surveillance s...

#beat#datadrift#datascience#machinelearning#modelmonitoring#system

RAG systems don’t know what’s sensitive — unless you ...

5,688 views

2025-07-05

RAG systems don’t know what’s sensitive — unless you tell them. Let’s talk about why access control is essential in Retrieval-Augmented Gene...

#access#chunks#documents#entitlements#generation#let#protecting#rag#sensitive

This video focuses on the difference between Word2Vec...

5,687 views

2025-06-14

This video focuses on the difference between Word2Vec, standard Transformers and Sentence Transformers for creating document embeddings. It ...

#comparing#embeddings#get#like#sentence#transformer#transformers#word2vec

SpeechT5 audio models getting added to transformers. ...

5,659 views

2023-02-08

SpeechT5 audio models getting added to transformers. #datascience #machinelearning #huggingface #speecht5 #speechmodels #audiomodels

#audiomodels#bin#datascience#datavisualization#histogram#histograms#huggingface#machinelearning#speechmodels#speecht5#statistics

So what's inside those large language models? This vi...

5,655 views

2023-06-08

So what's inside those large language models? This video explains the data pipeline for high-quality training data used in the latest LLMs l...

#common#commoncrawl#data#datascience#efficient#gpu#large#largelanguagemodels#machinelearning#memory#pages#training#transformers#using

You might assume Vision-Language Models like Claude o...

5,647 views

2025-10-30

You might assume Vision-Language Models like Claude or CLIP would crush defect detection. But on Amazon’s new Kaputt Dataset, the old-school...

#character#claude#detection#language#like#model#models#reveals#specs#stress#testing#vision

The video covers Retrieval Augmented Generation (RAG)...

5,638 views

2024-05-10

The video covers Retrieval Augmented Generation (RAG), a very popular approach for combining large language models and information retrieval...

#apply#build#documents#generative#get#hits#largelanguagemodels#like#may#plateau#problems#rag#retrievalaugmentedgeneration#snowflake#technology

Working with Categorical data using ordinal, one hot ...

5,638 views

2022-09-17

Working with Categorical data using ordinal, one hot (dummy), and target encoding #datascience #statistics #analytics #featureengineering

#analytics#categorical#data#datascience#featureengineering#statistics

SpeechT5 audio models getting added to transformers. ...

5,634 views

2023-02-09

SpeechT5 audio models getting added to transformers. #datascience #machinelearning #huggingface #speecht5 #speechmodels #audiomodels

#datascience#huggingface#machinelearning#speech#speechmodels#speecht5

The history of data science. I have since learned to ...

5,624 views

2024-02-18

The history of data science. I have since learned to make videos shorter and punchier.

#data#history#learned#make#science#since

Tips for working with small datasets. This includes u...

5,581 views

2025-03-09

Tips for working with small datasets. This includes using cross-validation, models like lasso, and running multiple iterations of feature im...

#cross#data#going#like#lot#validation

#stitch with @debtcollective Marketing and PR. This ...

5,570 views

2022-06-20

#stitch with @debtcollective Marketing and PR. This is a big topic and a lot of nuance isn’t in this video. Also relationships with academi...

#codetok#datascience#lot#machinelearning#mltok#stitch

Opus.ai very cool demo! If you want to build similar ...

5,561 views

2023-04-05

Opus.ai very cool demo! If you want to build similar apps check out the text to code models. Santacoder is open source and they have shared ...

#bigcode#datascience#games#largelanguagemodels#machinelearning#opus#santacoder#text#texttocode#video

Interpreting stable diffusion #stabilitydiffusion #da...

5,537 views

2022-09-16

Interpreting stable diffusion #stabilitydiffusion #datascience #codetok #machinelearning #texttoimage

#codetok#datascience#diffusion#machinelearning#stabilitydiffusion#texttoimage

Fine Tuning an Image Classifier on Indian Food Images

5,522 views

2022-08-04

Fine Tuning an Image Classifier on Indian Food Images

#classifier#fine#food#image#indian#tuning

Let's learn from the best! Feel free to share your fa...

5,520 views

2024-01-03

Let's learn from the best! Feel free to share your favorites. My list comes from the Data Vis Dispatch list of favorite visualizations for 2...

#data#datavisualizations#dispatch#let#list#vis

#instructionfinetuning #rlhf #reinforcementlearning #...

5,506 views

2024-04-11

#instructionfinetuning #rlhf #reinforcementlearning #pretrain Target leakage is a very common problem and everyone should understand it. Th...

#crowdai#data#leakage#machinelearning#sarcos#target

Sports! #datascience #analytics #codetok #machinelear...

5,495 views

2022-08-16

Sports! #datascience #analytics #codetok #machinelearning #rstats #footballanalytics #statistics

#analytics#codetok#datascience#footballanalytics#machinelearning#rstats

AI for other than productivity. Let's talk about how ...

5,481 views

2023-09-28

AI for other than productivity. Let's talk about how people are really using AI. #datascience #machinelearning #rajistics #therapy Lilian We...

#beyond#datascience#gpt4#let#machinelearning#multimodal#pdf#productivity#therapy

Critical question when framing out analytic questions...

5,467 views

2022-09-13

Critical question when framing out analytic questions, since extrapolation has got me into trouble before. #datascience #analytics #codetok

#analytics#codetok#datascience#extrapolation#like#think

With the writers' strike in the US this video reminds...

5,457 views

2023-07-20

With the writers' strike in the US this video reminds us of the human and environmental costs of building AI. Three critical components for ...

#building#easy#environment#ethicalai#fun#largelanguagemodels#machinelearning#openai#steps#wgastrike

Reply to @fondantlover datasaurus dozen howto #stats ...

5,447 views

2022-01-17

Reply to @fondantlover datasaurus dozen howto #stats #anscombe #datascience #analytics

#analytics#anscombe#datascience#dozen#rex#stats

#xgboost short history. #datascience #statistics #mac...

5,427 views

2022-05-14

#xgboost short history. #datascience #statistics #machinelearning #codetok

#codetok#datascience#kaggle#machinelearning#statistics#xgboost

Dive into model error metrics! From simple mean error...

5,410 views

2024-11-12

Dive into model error metrics! From simple mean error to Mean Squared Error (MSE) and Log Loss, let's see when you should use them. While MS...

#ang#better#error#function#log#logloss#loss#mean#mga#model#mse#nito#pagkakamali#squared

No one actually knows what a data scientist does, tak...

5,338 views

2022-12-07

No one actually knows what a data scientist does, take advantage of it.

#actually#data#knows#one#scientist#take

ChatDoctor is a great example of fine tuning a large ...

5,320 views

2023-03-22

ChatDoctor is a great example of fine tuning a large language model to get more factually correct output. This is an approach I expect many ...

#chatdoctor#chatgpt#datascience#largelanguagemodels#machinelearning#model

Q* from OpenAI is getting the hype but let's focus on...

5,314 views

2023-11-28

Q* from OpenAI is getting the hype but let's focus on the basics of their organization and the limitations of GPT-4 around planning. This vi...

#accelerating#aiplanning#gpt4#largelanguagemodels#machinelearning#openai#planning#prague#pytorch#qstar#transformers#video#working

Let's talk about how GPT-4 is going to affect enterpri...

5,289 views

2023-04-01

Let's talk about how GPT-4 is going to affect enterprise analytics. My upcoming public talks: AI Summit in Montreal on April 20 & Arize AI ev...

#analytics#april#datascience#enterprise#going#gpt#gpt4#machinelearning#openai

I know the pain. But there are ways to make it easy ...

5,287 views

2022-03-11

I know the pain. But there are ways to make it easy for people to use your code. #python #analysis #datascience

#analysis#datascience#know#let#pain#python

Document AI with LayoutLM #datascience #codetok #natu...

5,258 views

2022-06-05

Document AI with LayoutLM #datascience #codetok #naturallanguageprocessing #layoutml #huggingface #🤗 #ocr #deeplearning #multimodal

#codetok#datascience#huggingface#multimodal#naturallanguageprocessing#ocr

Baseline models for time series.

5,255 views

2024-10-09

Baseline models for time series.

#baseline#model#predicting#sales#series#time

Andrew Ng wrote recently on this no test set approach...

5,252 views

2023-05-22

Andrew Ng wrote recently on this no test set approach that he is seeing when people are using prompt engineering. This is very different tha...

#andrew#datascience#machinelearning#model#production#promptengineering#see#set#test#validation

Examining the data used for training our LLMs. Op...

5,241 views

2023-04-20

Examining the data used for training our LLMs. OpenAI is running into trouble in Europe since it won't disclose exactly what was used fo...

#data#databricks#dolly#fined#instruction#llama#mode#openai#reddit#redpajama#together#tuned

Data Quality in the AI Era - To learn more about this...

5,232 views

2024-11-20

Data Quality in the AI Era - To learn more about this example, check out Hannaneh Hajishirzi - OLMo: Accelerating the Science of Language Mo...

#data#improving#meta#models#molmo#multimodal#open#quality#time

Speed run - 8 minute video on 16 Challenges for using...

5,226 views

2023-08-10

Speed run - 8 minute video on 16 Challenges for using large language models (LLMs) 1. Unfathomable Datasets 2. Tokenizer-Reliance 3. High Pr...

#challenges#highlights#language#large#largelanguagemodels#llms#machinelearning#models#paper

Anomaly detection is hard. This is an introduction to...

5,210 views

2022-09-26

Anomaly detection is hard. This is an introduction to anomaly detection algorithms. The video focuses on the results for ADBench and what da...

#analytics#anomaly#codetok#data#datascience#detection

SetFit: Few Shot Learning for Text Classification

5,181 views

2022-10-28

SetFit: Few Shot Learning for Text Classification

#classification#contrastive#datascience#intuition#learning#machinelearning#setfit#shot#statistics#text

How Airbnb customer support is using generative AI. T...

5,175 views

2023-01-16

How Airbnb customer support is using generative AI. This is a great example of how @rajistics in context learning is growing and replacing t...

#airbnb#datascience#generativeai#largelanguagemodels#learning#machinelearning

Singular value decomposition is one of many low rank ...

5,168 views

2024-04-03

Singular value decomposition is one of many low rank methods when working with matrices. This video shares the intuition for why SVD matters...

#datascience#machinelearning#matrices#matrixalgebra#singularvaluedecomposition#svd

Replying to @Davos What's the best algorithm? 🤔 Th...

5,164 views

2023-06-26

Replying to @Davos What's the best algorithm? 🤔 There is no best algorithm! This is an excellent reminder of the no free lunch theorem; no a...

#1#algorithm#algorithms#best#datascience#machinelearning#nofreelunch

Nat.dev playground is awesome. Should be a great remi...

5,144 views

2023-03-10

Nat.dev playground is awesome. Should be a great reminder of the diversity of large language models. #datascience #machinelearning #largelan...

#datascience#gpt3#largelanguagemodels#machinelearning#models#nat#natdev

We all want to get paid. But just know you will end u...

5,128 views

2022-08-19

We all want to get paid. But just know you will end up miserable. #datascience #codetok #analytics

#analytics#codetok#datascience#get#paid#want

The Secrets of OpenAI O1 come down to scaling test-t...

5,114 views

2024-12-20

The Secrets of OpenAI O1 come down to scaling test-time compute. Hugging Face shared some research on ways to improve test-time compute usi...

#compute#model#models#openai#scaling#secrets#test#time

How enterprises are dealing with ChatGPT: it’s a prett...

5,090 views

2023-02-05

How enterprises are dealing with ChatGPT: it’s a pretty familiar cycle of grief. The good thing is it does open up lots of cool use cases. #...

#chat#chatgpt#datascience#got#gpt#hey

In this video, I explain Jason Wei’s insight from his...

5,071 views

2025-07-16

In this video, I explain Jason Wei’s insight from his recent blog post on the asymmetry of verification and what he calls Verifiers’ Law: AI...

#asymmetry#jason#law#like#math#progress#tasks#verification#verifier#wei

Favorite tweet today. #statistics #datascience #codet...

5,071 views

2022-05-19

Favorite tweet today. #statistics #datascience #codetok #machinelearning

#codetok#datascience#favorite#machinelearning#statistics#way

This group is holding COLLIDE Data Conference on Octo...

5,046 views

2023-08-30

This group is holding COLLIDE Data Conference on October 3-4 at Center Stage Theater 🎸🎭 in the heart of midtown Atlanta Georgia. Regis...

#august#collide#conference#data#datascience#earnings#group#holding#news#nvidia

Getting even bigger with all the new vision models

5,032 views

2024-09-25

Getting even bigger with all the new vision models

#advantage#ask#even#getting#models#take

Jokes explained - news in mid-May 2023 Google introdu...

5,010 views

2023-05-12

Jokes explained - news in mid-May 2023 Google introduced Bard2 which performs on par with GPT3.5 and Claude from Anthropic. Google also anno...

#amazon#anthropic#cohere#google#ibm#models#new#news#nvidia

#onthisday Time Series Decomposition is a great techn...

5,006 views

2024-06-12

#onthisday Time Series Decomposition is a great technique for starting to understand a time series.

#decomposition#let#onthisday#series#start#time

Are you GPU Poor? A deep dive into the state of GPUs ...

4,999 views

2023-11-22

Are you GPU Poor? A deep dive into the state of GPUs based on the work of Dylan Patel of SemiAnalysis. How are you coping with the lack of ...

#deep#dive#going#gpu#gpus#large#machinelearning#market#models#nvidia#poors#rich#semianalysis#state

One of my older videos: Predicting crime

4,995 views

2023-07-05

One of my older videos: Predicting crime

#crime#face#hugging#older#one#predicting#sentiment#spreadsheet#style#text#transfer#transformers#using#videos#wow

Entropy can be a useful measure in machine learning. ...

4,992 views

2023-04-27

Entropy can be a useful measure in machine learning. Entropy and information gain are used in building decision trees. I have also seen entro...

#credit#datascience#decisiontrees#entropy#featureengineering#informationgain#know#learning#liability#look#machine#machinelearning#rating

Crime seems easy to predict, but is super messy. #dat...

4,991 views

2022-07-02

Crime seems easy to predict, but is super messy. #datascience #crimetok #chicago #statistics #crimonology #machinelearning #codetok #aisnake...

#chicago#crime#crimetok#datascience#numbers#statistics

Intro to Conformal Prediction

4,948 views

2023-09-28

Intro to Conformal Prediction

#conformal#conformalprediction#datascience#getting#intro#older#prediction#predictioninterval#statistics

Uber’s FixRLeak system finds leaks with SonarQube, sc...

4,937 views

2025-11-11

Uber’s FixRLeak system finds leaks with SonarQube, scopes them with Tree-sitter AST analysis, then lets GenAI safely patch only what it unde...

#ast#automating#chatgpt#code#datascience#fixes#fixrleak#largelanguagemodels#leak#leaks#machinelearning#resource#right#scale#trainingml#uber

Is explainability important for you? #datascience #ex...

4,930 views

2022-08-06

Is explainability important for you? #datascience #explainability #interpretability #statistics #codetalk #machinelearning

#datascience#explainability#interpretability#price#statistics#understand

Grok 3 - The video explains how Grok 3's performance ...

4,899 views

2025-02-23

Grok 3 - The video explains how Grok 3's performance claims rely on majority voting across 64 predictions (cons@64) rather than single predi...

#cons#consensus#graphs#grok#one#performance#predictions#scores#shady#single#times#video
Video

Applying PaLM to the medical domain by using instruct...

4,893 views

2022-12-29

Applying PaLM to the medical domain by using instruction prompt tuning

#applying#domain#instruction#medical#palm#using

Data drift analysis is a must for production workload...

4,848 views

2023-03-13

Data drift analysis is a must for production workloads. Here is Uber’s D3 system for automated drift analysis. This video covers types of da...

#automated#data#drift#issues#prophet#uber

Using recursive feature elimination for feature selec...

4,836 views

2025-02-21

Using recursive feature elimination for feature selection for machine learning.

#deep#demo#eat#elimination#feature#features#learning#like#machine#melon#model#recursive#reinforcement#selection#using#would

Loss Functions - simple example of MAE versus RMSE #d...

4,820 views

2022-08-30

Loss Functions - simple example of MAE versus RMSE #datascience #statistics #analytics #codetok #regression

#analytics#arse#codetok#datascience#regression#statistics

NotebookLM - Convert your notes into a podcast Notebo...

4,808 views

2024-10-30

NotebookLM - Convert your notes into a podcast NotebookLM: https://notebooklm.google/ Notebook LLama: https://github.com/meta-llama/llama-re...

#audio#conversation#github#google#llama#notebook#notebookllama#notebooklm#open#podcast#recipes#source#version
Video

Obliviate is now possible for LLMs. Microsoft researc...

4,788 views

2023-10-07

Obliviate is now possible for LLMs. Microsoft researchers share an approach to get Large Language Models to unlearn information. #harrypotte...

#approximate#harry#harrypotter#largelanguagemodels#llms#machinelearning#obliviate#potter#summary#unlearning

GPT3.5 takes the bar exam with very little tuning. It...

4,775 views

2022-12-30

GPT3.5 takes the bar exam with very little tuning. It does pretty well. #gpt #datascience #machinelearning #barexam #law

#bar#datascience#exam#gpt#law#pretty
Video

The reality of AI Agents from Embra. While everyone h...

4,773 views

2023-08-23

The reality of AI Agents from Embra. While everyone hypes up Agents, it's a lot harder to make useful products based on Agents. #machinelearni...

#agents#aiagents#autogpt#datascience#embra#machinelearning

Cleaning data is such a pain. I remember having over ...

4,773 views

2022-09-30

Cleaning data is such a pain. I remember having over 130+ unique combinations for US States in one project.

#cleaning#data#pain#remember#teddy#unique

Reply to @shaggy335 #datascience #statistics #analyti...

4,768 views

2022-05-03

Reply to @shaggy335 #datascience #statistics #analytics #techtok #machinelearning

#analytics#datascience#machinelearning#something#statistics#techtok

What if Santa’s biggest problem this year is optimiza...

4,764 views

2025-11-28

What if Santa’s biggest problem this year is optimization? Packing 200 Christmas tree toys into the smallest box is harder than it looks. Gr...

#challenge#going#kaggle#noel#optimization#packing#papai#que#rvore#santa#tree#try#uma#voc#year

ML engineers discuss why cutting-edge academic models...

4,751 views

2025-02-27

ML engineers discuss why cutting-edge academic models aren't production-ready and often don't justify the implementation costs. They share p...

#data#lot#model#models#much#production

Mixing in a bit of law with the usual data science. L...

4,720 views

2022-10-28

Mixing in a bit of law with the usual data science. Let me know if this is interesting or if you are waiting for the deep dive on dbscan clustering...

#craiyon#dallemini#dolly#stablediffusion#texttoimage#trademark

Text to Chart. It’s easier than ever to build great c...

4,705 views

2023-02-15

Text to Chart. It’s easier than ever to build great charts using libraries like plotly or matplotlib. Are other people using ChatGPT for thi...

#chatgpt#datascience#like#machinelearning#matplotlib#plotly

Stylometric analysis—specifically the detection of ov...

4,664 views

2025-06-01

Stylometric analysis—specifically the detection of overused phrases known as "slop"—can reveal hidden changes in a language model's training...

#fingerprints#language#model#models#outputs#phrases#sam#slop#stylometry#uncovered

Wrap up of current events going on with chat includin...

4,660 views

2023-02-17

Wrap up of current events going on with chat including #openai #chatgpt #bing #amazon #datascience #machinelearning

#amazon#bing#chatgpt#like#models#openai

DeepSeek-R1 didn’t copy human reasoning—it learned it...

4,654 views

2025-09-19

DeepSeek-R1 didn’t copy human reasoning—it learned it. With pure RL (GRPO), it jumped from 15% to 80% on the AIME exam and began saying “wai...

#deepseek#gets#human#learning#model#performance#reasoning#superhuman#wait#works

Nvidia Prismer model for image captioning and zero sh...

4,653 views

2023-03-15

Nvidia Prismer model for image captioning and zero shot visual question answering. It uses an ensemble or mixture of experts approach. #dat...

#datascience#machinelearning#model#nvidia#prismer#uses

Try it out, link in comments. #huggingface #datascien...

4,640 views

2022-06-19

Try it out, link in comments. #huggingface #datascience #reinforcementlearning #deeplearning #codetok #mltok Earlier weeks: @rajistics @raji...

#codetok#datascience#deeplearning#huggingface#mltok#reinforcementlearning

History of the term regression and regression to the ...

4,633 views

2022-02-08

History of the term regression and regression to the mean #statistics #datascience #galton #regression #heriditary

#datascience#galton#heriditary#people#regression#statistics

Deep dive into reasoning models. Notebook is freely a...

4,624 views

2025-06-04

Deep dive into reasoning models. Notebook is freely available so go run it yourself. Notebook: https://github.com/rajshah4/LLM-Evaluation/b...

#able#benchmarks#don#going#illusion#like#models#reasoning#style#thinking#use#want

Can you tell I have been building agent workflows at ...

4,623 views

2024-12-31

Can you tell I have been building agent workflows at work?

#agentic#customer#data#demo#demoing#goes#production#tables#technical#workflows#wrong
Video

OpenAI dropped some big releases for its developer da...

4,615 views

2023-11-12

OpenAI dropped some big releases for its developer day. Let's catch up on the news in early November 2023. #openai #meta #nvidia #01.ai #h2o...

#china#google#h2o#meta#news#nov#nvidia#openai#roundup

Data scientists will typically use regularization, wh...

4,611 views

2022-11-12

Data scientists will typically use regularization, which means no p values. #machinelearning #datascience #statistics #pvalues

#data#datascience#machinelearning#pvalues#scientists#statistics

In this skit, a junior AI engineer tries to solve eve...

4,608 views

2025-03-29

In this skit, a junior AI engineer tries to solve everything by giving the model more "thinking time" — but runs into the hard truth about v...

#apps#building#chess#compute#dify#generative#gui#introduction#isn#model#tasks#thinking#time
Video

Isolation Forests for Anomaly or Outlier Detection

4,605 views

2024-08-03

Isolation Forests for Anomaly or Outlier Detection

#anomaly#anomalydetection#daten#detection#die#forests#isolation#ist#madeinhotelroom#outlier#und

Reply to @bosstoastmaker shallow learning with tensor...

4,593 views

2022-02-20

Reply to @bosstoastmaker shallow learning with tensorflow playground #datascience #tensorflow #python #machinelearning

#datascience#going#let#machinelearning#python#tensorflow

Breaking down how advances in AI, from GPT to Veo 3 —...

4,566 views

2025-05-24

Breaking down how advances in AI, from GPT to Veo 3 — owe their performance to massive, often ethically questionable datasets. It traces the...

#data#gpt#images#know#models#quantization#set#slimming#training

This video discusses a Carnegie Mellon study comparin...

4,564 views

2025-05-11

This video discusses a Carnegie Mellon study comparing prompt-based inference with fine-tuned large language models. The research found that...

#approach#context#examples#fine#hallucinations#language#large#models#performance#practical#tuning

Reply to @notryantaylor here it is without music. T...

4,539 views

2022-02-11

Reply to @notryantaylor here it is without music. This is for my 4 kids who all text me that they are above average.

#average#notryantaylor#percent#reply#think#thought
Video

Thanks to barrnanas and AmplifyPartners

4,538 views

2023-10-10

Thanks to barrnanas and AmplifyPartners

#amplifypartners#barrnanas#thanks

An Introduction to Dify.AI - A UI based tool for buil...

4,518 views

2025-03-23

An Introduction to Dify.AI - A UI based tool for building Generative AI Agentic Workflows or Applications. I have a longer video on YT going...

#able#applications#build#building#code#genai#generative#improve#like#recommenders#recsys

Deep dive into Group Relative Policy Optimization (GR...

4,498 views

2025-02-16

Deep dive into Group Relative Policy Optimization (GRPO), a Reinforcement Learning algorithm used by Deepseek in their R1 reasoning model. L...

#algorithms#app#deep#dive#going#group#interactive#kind#learning#like#look#model#notebook#outlier#relative#see#visualization
Video

Some common data distributions when modeling includin...

4,498 views

2023-02-03

Some common data distributions when modeling including skewed and zero inflated. There are many other distributions, but just wanted people ...

#common#data#datadistribution#datascience#distributions#statistics#tweedie#zeroinflated
Video

#onthisday Showing the latent space for stable diffus...

4,495 views

2023-09-10

#onthisday Showing the latent space for stable diffusion. #stablediffusion #datascience #machinelearning #codetok #umapêpravocê

#analogy#chatgpt#codetok#datascience#largelanguagemodels#machinelearning#onthisday#politics#stablediffusion#umap#well#works

Quantization used to be a post-training compromise, s...

4,478 views

2025-11-16

Quantization used to be a post-training compromise, smaller and faster but at the cost of accuracy. Kimi K2-Thinking flips the script using ...

#compromise#faster#int4#kimi#language#models#nvidia#outsmart#post#quantization#reasoning#training#used#vision

Highlighting some great work investigating basic beha...

4,478 views

2025-02-08

Highlighting some great work investigating basic behavior of LLMs and finding issues with their reliability: Do Large Language Model Benchma...

#arxiv#basic#benchmark#compressing#distillation#investigating#kept#language#large#llms#models#night#org#pdf#quantization#reliability#using

Most people evaluate RAG the wrong way — just checkin...

4,467 views

2025-03-30

Most people evaluate RAG the wrong way — just checking if the answer is “correct.” But great RAG needs to answer: how did it get there? In t...

#actually#answer#based#explained#get#metrics#need#rag#ragas#simply

Recommenders! I saw an internal presentation from Sim...

4,447 views

2024-05-25

Recommenders! I saw an internal presentation from Simran on the effectiveness of implicit for a recommender system and wanted to share it. C...

#collaborative#explicit#filtering#implicit#need#recommender

AI researchers assumed more sensory data—like video—w...

4,435 views

2025-06-27

AI researchers assumed more sensory data—like video—would lead to smarter, more reasoning-capable models. But it didn’t work. While video mo...

#hyperparameter#language#like#models#optimization#reasoning#smarter#text#video#wasting#without#xgboost
Video

Labeling companies like Scale are hiring people to bu...

4,425 views

2023-05-09

Labeling companies like Scale are hiring people to build and improve models based on these skills. By next year people in these fields shoul...

#chatgpt#datascience#going#hallucinations#language#large#largelanguagemodels#like#llms#machinelearning#models#scale#trainingml

ggplot, matplotlib, plotly, and seaborn are what data...

4,423 views

2022-10-03

ggplot, matplotlib, plotly, and seaborn are what data scientists use to make a plot or graph. #datascience #visualization #plots #analytics...

#analytics#datascience#ggplot#matplotlib#plots#visualization

MobileLLM - A great paper that details experiments fo...

4,389 views

2024-07-11

MobileLLM - A great paper that details experiments for an efficient architecture. It includes using SwiGLU, deeper/thinner architectures, reduci...

#blocks#efficient#mobilellm#model#sram#using

New good stuff. Let’s compare the performance, cost, ...

4,381 views

2025-05-01

New good stuff. Let’s compare the performance, cost, and task alignment for using OpenAI o3 versus a small model trained with Group Relative...

#email#entropy#gain#grpo#information#learning#like#machine#model#task#trained#understanding
Video

Joys of autocomplete, who is with me? #datascience #p...

4,379 views

2022-01-28

Joys of autocomplete, who is with me? #datascience #programming #vscode #jupyternotebook #coding #tabcompletion #python

#coding#datascience#jupyternotebook#programming#tabcompletion#vscode

X-decoder from Microsoft. Check out the instructional...

4,356 views

2023-02-15

X-decoder from Microsoft. Check out the instructional text demo. I added in video released by the team at the bottom. If too many people don...

#datascience#machinelearning#model#pix2pix#text#x
Video

GPUs driven by NVIDIA are the key to today's AI. With...

4,349 views

2023-09-19

GPUs driven by NVIDIA are the key to today's AI. Without this compute we would not have the models like GPT-4. Let's review why GPU performa...

#deeplearning#docs#driven#driving#generative#gpus#increase#machinelearning#nvidia#power

Reply to @coronavirusvevo #xgboost #regression #stati...

4,326 views

2022-02-18

Reply to @coronavirusvevo #xgboost #regression #statistics #datascience #algorithms

#algorithms#data#datascience#regression#statistics#xgboost

The video contrasts complex neural networks with simp...

4,305 views

2025-03-12

The video contrasts complex neural networks with simpler, interpretable models like Generalized Additive Models (GAMs), which provide clear ...

#approaches#better#complex#contrasts#decisions#ensembling#interpretable#like#majority#models#rules#simpler#video#voting

Chronicles of OPT Training #meta #nlp #datascience #m...

4,297 views

2022-05-12

Chronicles of OPT Training #meta #nlp #datascience #machinelearning #deeplearning #codetok #python

#codetok#datascience#deeplearning#machinelearning#meta#nlp

Agents just learned to talk without words. They pass ...

4,238 views

2025-12-05

Agents just learned to talk without words. They pass thoughts directly in latent space, not text. It is faster, cheaper, and even boosts acc...

#agents#communication#english#latent#latentmas#learned#talk#text#thoughts#without#words

Notebook walkthrough of the 11B model using Hugging F...

4,233 views

2024-09-30

Notebook walkthrough of the 11B model using Hugging Face Transformers on Snowflake The notebook highlights: Short background on vision langu...

#going#image#model#models#transformers#use

Reminding everyone not to fall for the "research" rep...

4,233 views

2024-06-08

Reminding everyone not to fall for the "research" reports from Forrester or Gartner. They should be treated like marketing material or PR ma...

#companies#forrester#like#make#marketing#research

Using diffusion for object detection in diffusiondet....

4,230 views

2022-11-21

Using diffusion for object detection in diffusiondet. #datascience #machinelearning #objectdetection #computervision

#computervision#datascience#detection#machinelearning#object#objectdetection

Red light camera #chicago #datascience #redlightcame...

4,218 views

2022-04-30

Red light camera #chicago #datascience #redlightcamera #anomalydetection #statistics #techtok #analytics

#anomalydetection#chicago#datascience#look#redlightcamera#statistics

This skit highlights key data science concepts, aroun...

4,196 views

2024-05-19

This skit highlights key data science concepts, around error analysis and iterative improvement in building models. It shows how data scient...

#data#errors#learning#models#scientist#scientists

Facts #datascience #techtok #analytics #impostersyndrome

4,166 views

2022-04-28

Facts #datascience #techtok #analytics #impostersyndrome

#academically#analytics#datascience#impostersyndrome#rigorous#techtok

Relevance maps for image classification. Model explai...

4,137 views

2022-11-21

Relevance maps for image classification. Model explainability is always important. #datascience #explainability #machinelearning #imageclass...

#datascience#explainability#image#imageclassication#machinelearning#model

Alternatives to Transformers. There is a lot of resea...

4,110 views

2024-12-16

Alternatives to Transformers. There is a lot of research in this area, this video from last year highlights Mamba which is using a state spa...

#alternatives#attention#lot#mamba#really#state

Wow! I am impressed with Claude’s new Skills feature....

4,109 views

2025-10-18

Wow! I am impressed with Claude’s new Skills feature. It can make my life easier (and I know I sound like a shill, but this is super useful ...

#api#call#claude#code#evaluation#gepa#like#new#optimization#prompt#reflective#skills

My take on Objaverse, Llama, and Alpaca. Not a lot of...

4,104 views

2023-03-25

My take on Objaverse, Llama, and Alpaca. Not a lot of respect for copyright or contract terms. #largelanguagemodels #datascience #machinelea...

#alpaca#dataset#llama#model#objaverse#openai

This video explains how scaling laws—particularly fro...

4,046 views

2025-04-16

This video explains how scaling laws—particularly from the Chinchilla paper—reveal a tradeoff between model size, training data, and compute...

#chatgpt#compute#large#laws#many#model#models#modifying#scaling#smaller#train#ways

Watch out for the flashy new algorithm.

4,016 views

2024-10-22

Watch out for the flashy new algorithm.

#data#different#going#revolution#science#try

Using AI for Pose Detection, this is such a cool appl...

4,008 views

2022-09-05

Using AI for Pose Detection, this is such a cool application. #datascience #deeplearning #codetok #posedetection #sportsanalytics

#body#codetok#datascience#deeplearning#posedetection#sportsanalytics
Video

Apple's MM1: Methods Analysis & Insights from Multimo...

4,007 views

2024-03-19

Apple's MM1: Methods Analysis & Insights from Multimodal LLM Pre-training - https://arxiv.org/pdf/2403.09611.pdf Cohere int8 & binary Embedd...

#binary#cohere#contextual#embeddings#int8#let#model#pdf
Video

Evaluating generative models means considering many f...

4,000 views

2023-06-24

Evaluating generative models means considering many factors, including prompts, tokenization, and evaluating generated results. This video shou...

#datascience#different#evaluating#generative#largelanguagemodels#leaderboards#machinelearning#mmlu#modelevaluation#models#nuances

8 Ways to improve your RAG Application 1. Metadata Fi...

3,999 views

2025-05-09

8 Ways to improve your RAG Application 1. Metadata Filter 2. Semantic Chunking 3. Visual Language Model 4. Query Decomposition 5. Better Emb...

#application#better#chunking#improve#rag#reranker#semantic#using#ways
Video

Object detection for AI using yolov6 Check out the de...

3,991 views

2023-07-13

Object detection for AI using yolov6 Check out the demo at: https://huggingface.co/spaces/rajistics/yolov6

#check#data#demo#detection#every#license#llm#model#new#object#things#top#training#using#yolov6

Software exec at the end is the best. Your quick intr...

3,976 views

2022-10-30

Software exec at the end is the best. Your quick intro to patents, trademarks, copyright, and licenses. I see too many comments where peopl...

#copyright#licenses#patents#people#software#use

It’s rough for statisticians, machine learning is so ...

3,957 views

2022-09-15

It’s rough for statisticians, machine learning is so popular #datascience #analytics #statistics #machinelearning

#analytics#datascience#fucking#machinelearning#rough#statistics
Video

Data science is often team work and you want to try t...

3,946 views

2023-09-04

Data science is often team work and you want to try to avoid toxic teams. #datascience #rajistics #teamwork #kaggle

#data#datascience#kaggle#look#often#science#teamwork

An increasingly popular approach for labeling data is...

3,941 views

2024-12-22

An increasingly popular approach for labeling data is getting assistance with LLMs. Here this approach is highlighted using MLFlow. This is ...

#approach#data#human#labeling#lot#model

The enterprise AI landscape has radically shifted in ...

3,936 views

2025-08-07

The enterprise AI landscape has radically shifted in 2025. Anthropic has overtaken OpenAI in enterprise usage by offering broader cloud acce...

#anthropic#api#based#enterprise#enterprises#llm#market#menlo#mid#open#openai#update#usage#year

Quick intro, let me know if a deeper dive is useful. ...

3,896 views

2022-07-19

Quick intro, let me know if a deeper dive is useful. #translation #meta #datascience #machinelearning #huggingface

#datascience#huggingface#language#machinelearning#meta#translation

Error analysis is an important skill in analytics.

3,895 views

2024-12-10

Error analysis is an important skill in analytics.

#air#also#analysis#error#improve#like#model
Video

Let's up your LLM game by going over the use of promp...

3,890 views

2024-02-22

Let's up your LLM game by going over the use of prompting strategies, fine tuning, and using synthetic datasets. This was motivated by some gr...

#data#fine#open#predibase#source#synthetic

J.P. Morgan processes 50 million transactions a day —...

3,869 views

2025-10-27

J.P. Morgan processes 50 million transactions a day — and they didn’t use GPT-5. By layering rules, text similarity, and a tiny 1.7 M-parame...

#actually#deployment#enterprise#gpt#jpmc#million#model#models#parameter#rules#transactions#works

Collaborative filtering is a very popular and useful ...

3,868 views

2025-05-21

Collaborative filtering is a very popular and useful way to build a recommender. However, getting explicit feedback is hard, and that is whe...

#building#explicit#feedback#implicit#need#one#recommenders#using#video#watch

Just how smart is ChatGPT and other #largelanguagemod...

3,868 views

2023-01-04

Just how smart is ChatGPT and other #largelanguagemodels? Big Bench is a set of benchmark tests to assess the performance of the models. And ...

#chat#chatgpt#datascience#largelanguagemodels#machinelearning#models

Optimal Transport algorithms to efficiently allocate ...

3,864 views

2025-04-23

Optimal Transport algorithms to efficiently allocate resources—in this case, croissants from eight bakeries to five cafes. It begins by cons...

#algorithms#bakeries#croissants#distance#optimal#regularization#transport#using

ShinkaEvolve pairs evolutionary algorithms with LLMs ...

3,848 views

2025-09-27

ShinkaEvolve pairs evolutionary algorithms with LLMs to invent new solutions faster. Using novelty-based rejection, smarter parent selection...

#algorithms#chica#evolutionary#evolve#invent#llm#llms#meet#new#pairs#search#shinkaevolve

A reminder that most enterprises favor Apache and MIT...

3,845 views

2023-01-06

A reminder that most enterprises favor Apache and MIT licenses. As a developer, use what you please. But to reach people working within comp...

#datascience#licenses#machinelearning#software#softwarelicensing#use

5 things to look for when a new model is announced Li...

3,831 views

2024-04-20

5 things to look for when a new model is announced License Real Open Source? - Apache 2 Commercial use? Strange conditions? Size of the mode...

#attention#context#data#gpus#length#license#llmineuur_25#llms#longer#model#new#ring#ringattention#training#transformers
Video

Hallucinations from large language models are a conce...

3,818 views

2023-05-06

Hallucinations from large language models are a concern. However balance them against the effectiveness of these models and the risks of usi...

#datascience#hallucinations#largelanguagemodels#machinelearning#model#models#practicalai
Video

Statistics sounds heavy but a lot of concepts are ver...

3,805 views

2024-01-07

Statistics sounds heavy but a lot of concepts are very useful and can save you a lot of effort. This video is a reminder of the many ways we u...

#concepts#heavy#lot#random#sounds#statistics

Research on productivity with the new AI code tools f...

3,789 views

2025-10-10

Research on productivity with the new AI code tools from Stanford, inspired by their talk I saw at the MLOps summit. Lots of great insights. Th...

#code#coding#engineering#lot#new#productivity#quality#quantifying#research#software#stanford#tools#way

LLMs can write SQL. That’s not the hard part. The har...

3,785 views

2025-11-19

LLMs can write SQL. That’s not the hard part. The hard part is making sure the math matches how the business actually works. Docs help AI un...

#business#guessing#hard#layer#llms#part#revenue#semantic#sql#text#write

Typical issues that often come up in everyday data sc...

3,773 views

2022-12-04

Typical issues that often come up in everyday data science. Data scientists only spend a small amount of time on algorithms. #datascience #...

#data#datascience#going#machinelearning#time#want

Everyone says “just add more agents.” This new Google...

3,768 views

2025-12-14

Everyone says “just add more agents.” This new Google + MIT paper tested 180 multi-agent setups and found something uncomfortable: on averag...

#agent#agents#architecture#coordination#everyone#multi#paper#says#scaling#systems#task

As we store more information as vectors or embeddings...

3,761 views

2024-04-16

As we store more information as vectors or embeddings, vector databases are gaining importance. For small amounts of embeddings numpy or FAIS...

#databases#datascience#embeddings#faiss#pinecone#vector

Building a Claude 3.7 AI Researcher: The Framework Di...

3,759 views

2025-04-24

Building a Claude 3.7 AI Researcher: The Framework Dilemma I built this three ways using Claude 3.7's extended thinking capabilities with a ...

#agno#based#claude#code#framework#hallucinations#lines#research#thinking#transluce#truthfulness#using
Video

Your Ai News this week. #google #openai #meta #apple ...

3,742 views

2023-09-16

Your Ai News this week. #google #openai #meta #apple #tesla #rajistics

#apple#explaining#google#information#latent#let#llms#meta#news#numbers#openai#politicians#see#space#tesla#using

Encoders come in three flavors: * Encoder only conver...

3,739 views

2025-09-06

Encoders come in three flavors: * Encoder only converts single texts into embeddings. * Bi-encoder encodes queries and documents separately ...

#cross#documents#encoder#encoders#explained#one#rerankers#token

I really liked this paper from DeepMind on Synthetic ...

3,736 views

2024-04-18

I really liked this paper from DeepMind on Synthetic Data. It highlights a lot of interesting uses of synthetic data along with concerns ab...

#arxiv#best#data#learned#lessons#org#pdf#practices#really#synthetic

Unit Testing Deep Dive: ⚡ Evaluating Unit Tests with ...

3,733 views

2025-02-10

Unit Testing Deep Dive: ⚡ Evaluating Unit Tests with LMUnit 🎯 Polar plot visualizations of multi-dimensional scores 🔬 K-means clustering ...

#api#contextual#going#language#like#llms#lmunit#natural#request#see#testing#tests#unit#want
Video

WorkArena and WebArena are some newer real benchmarks...

3,731 views

2024-03-15

WorkArena and WebArena are some newer real benchmarks for real-world tasks. To build wider automation, it's going to be essential to solve...

#automation#benchmarks#gpt#largelanguagemodels#newer#real#state#tasks#using#webarena#workarena#world

Six distributions. One video. A surprisingly relatabl...

3,722 views

2025-11-29

Six distributions. One video. A surprisingly relatable breakdown of why the real world does not follow the bell curve. Diving into Normal, P...

#bell#curve#distributions#engineer#every#law#normal#one#power#real#relatable#six#surprisingly#tail#video

Glitch tokens can have some unusual effects. The most...

3,715 views

2024-05-12

Glitch tokens can have some unusual effects. The most well known was the SolidGoldMagikarp token. The folks at Cohere dug into this and spen...

#glitchtokens#largelanguagemodels#like#time#token#tokens

Fundamentals, but I get asked about this all the time.

3,714 views

2025-03-08

Fundamentals, but I get asked about this all the time.

#data#get#help#need#one#science#time

Cosine similarity is a must know when working with ve...

3,690 views

2022-10-14

Cosine similarity is a must know when working with vectors. It’s very useful and widely used in #machinelearning #datascience #statistics S...

#datascience#machinelearning#python#statistics#use#vectors

Picking a GPU for deep learning based on Tim Dettmers...

3,688 views

2023-01-17

Picking a GPU for deep learning based on Tim Dettmers' classic blog post. #datascience #machinelearning #gpu #deeplearning (had to repost thi...

#datascience#deeplearning#gpu#machinelearning#tim#using

Knowledge distillation helps make smaller models that...

3,677 views

2025-03-13

Knowledge distillation helps make smaller models that work well. DistilBERT is a popular small model created using this method. Resources: D...

#building#data#distillation#knowledge#larger#model#models#small#smaller#using#well

Researchers at Princeton ran 20,000 tests across nine...

3,633 views

2025-10-19

Researchers at Princeton ran 20,000 tests across nine benchmarks—spending $40,000—to see how AI agents really perform. Turns out the Agent M...

#agents#gemini#imo#like#made#math#models#often#prompting#ran#researchers#solving#using
Video

It's uncomfortable how good this analogy is. I have so...

3,619 views

2023-09-09

It's uncomfortable how good this analogy is. I have so much more material I left out. Enjoy this, don't know how long it will stay up. #largela...

#analogy#apple#chatgpt#effects#good#google#largelanguagemodels#meta#new#news#office#openai#politics#productivity#study#tesla#uncomfortable#worker

4 Data Science Fails. These are a handful of ways tha...

3,615 views

2025-06-12

4 Data Science Fails. These are a handful of ways that society pushes back on data science approaches. It's good to understand why these wer...

#algorithm#alpha#approaches#data#examples#failed#get#like#medicine#model#superhuman

Use it! #python #conda #codetok #datascience

3,607 views

2022-05-16

Use it! #python #conda #codetok #datascience

#codetok#conda#datascience#python#suscr#use

Transfer learning and how we have built models like C...

3,605 views

2024-09-12

Transfer learning and how we have built models like ChatGPT. It's a long talk, so enjoy.

#able#data#enjoy#gave#going#learns#like#model#models#recently#talk

In a previous video, I focused on OpenAI’s model, but...

3,605 views

2025-02-04

In a previous video, I focused on OpenAI’s model, but this issue goes far beyond just one example. AI content detectors suffer from the base...

#art#beating#cheaters#didn#false#focused#grpo#issue#openai#positive#previous#rate#students#trainer#using#video

Reviewing some new research looking into prompting ve...

3,604 views

2024-05-07

Reviewing some new research looking into prompting versus fine tuning. They both have their place, but prompting performance can continue to...

#examples#fine#models#performance#prompting#tuning

This video breaks down how Gemini 2.5 Pro, a publicly ...

3,591 views

2025-08-01

This video breaks down how Gemini 2.5 Pro, a publicly available model, solved 5 out of 6 problems from the IMO 2025 without any fine-tuning. ...

#context#dynamics#gemini#gold#imo#implicit#learning#model#pro#problems#training#without

The future will be many different LLMs, some open sour...

3,585 views

2023-12-24

The future will be many different LLMs, some open source and some proprietary. Others, like Yann LeCun, think differently. Yann's thread: https:...

#different#future#largelanguagemodels#llms#many#model#models#open#proprietary#six#source#yann

In this deep dive, I go beyond the RAG basics to focu...

3,581 views

2025-10-13

In this deep dive, I go beyond the RAG basics to focus on the most critical component: Retrieval. We'll provide a practical framework for th...

#bm25#deep#dive#embeddings#going#kind#like#look#one#rag#retrieval#right

Fun way to talk about K-means algorithm #datascience ...

3,539 views

2022-08-22

Fun way to talk about K-means algorithm #datascience #codetok #analytics #machinelearning

#algorithm#analytics#codetok#data#datascience#machinelearning

LLMs are approximate retrievers that are mimicking pl...

3,526 views

2023-08-12

LLMs are approximate retrievers that are mimicking plans rather than truly planning. Great argument put forth by Subbarao Kambhampati who is...

#aiplanning#aireasoning#gpt4#largelanguagemodels#llms#machinelearning#planners#reasoning#retrieval#versus

Why you want prediction intervals instead of point pr...

3,516 views

2022-09-20

Why you want prediction intervals instead of point predictions #datascience #machinelearning #statistics #predictioninterval

#datascience#machinelearning#prediction#predictioninterval#statistics#want

Oldie but still very relevant.

3,514 views

2025-01-28

Oldie but still very relevant.

#augmented#didn#document#documentation#explained#flamingo#generate#generation#hey#paper#quickly#rag_#responses#retrieval#review#sections#share#specific

I need to focus on adding more Regularization to my l...

3,512 views

2022-11-19

I need to focus on adding more Regularization to my life. #datascience #statistics #regularization

#datascience#hammering#maybe#regularization#statistics#videos

Cleanlab is open source and will improve your data qu...

3,507 views

2023-01-30

Cleanlab is open source and will improve your data quality. It’s so underrated. This was hard to record vertically, so go try it out. #datas...

#cleanlab#data#datascience#machinelearning#model#see

Repost which has held up well: Language Models like C...

3,499 views

2024-04-10

Repost which has held up well: Language Models like ChatGPT can be modified by several methods, including Prompting, Instruction Fine-Tuning a...

#chatgpt#datascience#like#model#rlhf#using

Thinking about the size of numbers becomes important ...

3,497 views

2024-05-17

Thinking about the size of numbers becomes important when working with neural networks. This video touches on different techniques like usin...

#bfloat16#datascience#floating#models#point#quantization

Word as Image - great use of generative AI models lik...

3,481 views

2023-03-07

Word as Image - great use of generative AI models like Stable Diffusion to create fonts. Check out the paper at wordasimage.github.io #datas...

#datascience#fonts#generativeai#going#machinelearning#stablediffusion#word

Why you want prediction intervals instead of point pr...

3,461 views

2022-10-09

Why you want prediction intervals instead of point predictions. This is a repost because the first one was taken down. #datascience #codeto...

#codetok#datascience#machinelearning#prediction#predictioninterval#statistics

Zero shot learning #datascience #machinelearning #hug...

3,460 views

2022-05-27

Zero shot learning #datascience #machinelearning #huggingface #nlp #naturallanguageprocessing #statistics background: @rajistics #codetok

#datascience#huggingface#machinelearning#naturallanguageprocessing#nlp#statistics

Q* from OpenAI is getting the hype but let's focus on...

3,458 views

2023-11-28

Q* from OpenAI is getting the hype but let's focus on the basics of their organization and the limitations of GPT-4 around planning. This vi...

#aiplanning#gpt4#largelanguagemodels#llms#open#openai#planning#qstar#role#video

Let's talk about how copyright intersects with large l...

3,455 views

2023-08-19

Let's talk about how copyright intersects with large language models around training LLMs, outputs of LLMs, and watermarking mechanisms. #datascien...

#copyright#datascience#factor#key#language#large#largelanguagemodels#latency#limits#machinelearning#meet#models#others#response#talk

Roundup of all the big headlines, hope this is fun fo...

3,448 views

2023-03-04

Roundup of all the big headlines, hope this is fun for you all. I laugh while making these, but wonder how many of you get all the referenc...

#apple#datascience#google#machinelearning#meta#openai

Google announced Bard, but we still don’t know much...

3,437 views

2023-02-06

Google announced Bard, but we still don’t know much. It is based on LaMDA, which has been around for a while. This is a safe bet, no...

#announced#chatgpt#datascience#google#largelanguagemodels#machinelearning

Software licensing

3,429 views

2025-05-10

Software licensing

#amazon#code#inside#license#robots#shortcomings#software#use#want#warehouse

Give it up to Data Engineers. #dataengineering #data...

3,423 views

2022-03-08

Give it up to Data Engineers. #dataengineering #datascience #analytics

#analytics#data#dataengineering#datascience#give#happy

Training an image classifier using ü§ó transformers ...

3,412 views

2022-08-04

Training an image classifier using ü§ó transformers #datascience #analytics #codetok #deeplearning #huggingface Longer video at other site ...

#analytics#codetok#datascience#deeplearning#dumbtechnews#google#huggingface#machinelearning#microsoft#openai#using

Hyperparameter optimization or search is an important...

3,408 views

2024-06-20

Hyperparameter optimization or search is an important step in many machine learning algorithms. I cover a few of the basic approaches, inclu...

#algorithms#hyperparameter#hyperparameters#like#optimization#search

Everyone’s racing to make RAG faster — but my latest ...

3,396 views

2025-10-11

Everyone’s racing to make RAG faster — but my latest tests show that might be the wrong goal. Agentic RAG, with multiple retrievals and a re...

#accuracy#agentic#changes#make#rag#reasoning#retrieval#search#static

Oasis is an interactive generative world model based ...

3,393 views

2024-11-05

Oasis is an interactive generative world model based on diffusion transformers. It takes keyboard input and generates gameplay in an autoreg...

#decart#engine#game#gaming#generative#interactive#like#model#new#oasis#text#watch#world

Feeling overwhelmed by all the hyperparameter options...

3,375 views

2025-06-21

Feeling overwhelmed by all the hyperparameter options in XGBoost? This video walks through practical tips — from grid search and random sear...

#alone#data#hyperparameter#learn#like#llms#optimization#relationships#search#spatial#text#want

Don't get caught up in the hype. The main value for L...

3,366 views

2024-04-12

Don't get caught up in the hype. The main value for LLMs is marketing. Most of us are better off working on evaluation and prompting rather ...

#better#don#get#gpus#model#want

OpenAI made routing the secret weapon inside GPT-5 — ...

3,365 views

2025-08-23

OpenAI made routing the secret weapon inside GPT-5 — Sam Altman even admitted when it broke, the model felt dumber. Now researchers have gon...

#accuracy#avengers#cost#going#gpt#model#pro#router#routing

Llama 4 - planning and release decisions that went in...

3,363 views

2025-04-06

Llama 4 - planning and release decisions that went into the model.

#agents#contextual#mcp#million#model#one#rag#release#right#server#well

With the growth of open-source LLMs many leaderboards...

3,363 views

2023-06-10

With the growth of open-source LLMs many leaderboards to rank these models are emerging. Several different methodologies are used including ...

#datascience#evaluation#getting#largelanguagemodels#leaderboards#leaderboardsgun#llms#machinelearning#models#open#production#source#use

Are sophisticated agents really better? With GPT-5 un...

3,360 views

2025-08-17

Are sophisticated agents really better? With GPT-5 unlocking Agentic AI, I break down four practical best practices—simplicity, structure, o...

#agentic#agents#best#building#practices#production#structure#systems#things#userjot

#reinforcementlearning #huggingface #datascience #dee...

3,357 views

2022-05-26

#reinforcementlearning #huggingface #datascience #deeplearning #codetok #deepqlearning - Week 1: @rajistics

#codetok#datascience#deeplearning#deepqlearning#huggingface#reinforcementlearning

Meta fumbled the open-source lead; Qwen—Alibaba Cloud...

3,349 views

2025-08-16

Meta fumbled the open-source lead; Qwen—Alibaba Cloud’s open-weight family—has taken it, with Apache-2.0 models spanning 0.6B → 235B MoE (~2...

#12#6#arena#champion#models#open#quinn#qwen#source#text#top

From the article: How do Authors’ Perceptions about t...

3,335 views

2022-11-23

From the article: How do Authors’ Perceptions about their Papers Compare with Co-authors’ Perceptions and Peer-review Decisions? #statistics...

#article#authors#compare#papers#perceptions#statistics

#duet with @hugging_face Looks like we are on TikTok...

3,334 views

2022-06-08

#duet with @hugging_face Looks like we are on TikTok. Go try mini DALL-E at hf.co

#duet#hugging#like#mini#try#use

Sweep AI showed how to really make autocomplete work ...

3,323 views

2025-09-20

Sweep AI showed how to really make autocomplete work in JetBrains. They moved from plain Fill-in-the-Middle to syntax-aware training on real...

#autocomplete#build#code#decoding#jetbrains#model#real#really#showed#step#sweep#syntax

Red light cameras in Chicago.

3,317 views

2025-05-01

Red light cameras in Chicago.

#agno#anomalies#chicago#claude#dilemma#framework#gen#light#look#python#red#tickets

Getting explainability when working with transformer ...

3,310 views

2022-10-19

Getting explainability when working with transformer based vision models. Uses Captum on the backend, but makes it easy to get image attribu...

#captum#computervision#datascience#explainability#huggingface#machinelearning

Reply to @declinedher long history of analyzing waste...

3,294 views

2022-01-31

Reply to @declinedher long history of analyzing wastewater for drug residue #dataanalysis #wastewater #drugs #monitoring @rajistics

#dataanalysis#declinedher#drugs#monitoring#reply#wastewater

Your weekly dose of LLM news. I liked this because it...

3,287 views

2022-11-06

Your weekly dose of LLM news. I liked this because it had interesting results with a smart approach. #datascience #machinelearning #largelan...

#approach#datascience#large#largelanguagemodels#machinelearning#results

Large language models don’t just process language—the...

3,261 views

2025-06-20

Large language models don’t just process language—they build internal spatial maps. This video breaks down the paper “Linear Spatial World M...

#best#cube#engineering#inside#language#model#models#practices#prompt#spatial

This video explores why OpenAI’s o3 models sometimes ...

3,259 views

2025-04-17

This video explores why OpenAI’s o3 models sometimes hallucinate / fabricate actions, such as claiming to run code they cannot execute. Thes...

#actions#answer#code#don#hallucinate#models#sometimes

Tensorflow playground, link in comments, #tensorflow ...

3,252 views

2022-02-04

Tensorflow playground, link in comments, #tensorflow #deeplearning #datascience #analytics #neuralnetworks

#analytics#datascience#deeplearning#neuralnetworks#playground#tensorflow

Customer lifetime value is a classic use case.

3,248 views

2025-03-04

Customer lifetime value is a classic use case.

#analysis#customer#customers#different#great#marketing#value

Understanding a confusion matrix, Part I video: @raji...

3,244 views

2022-03-07

Understanding a confusion matrix, Part I video: @rajistics #datascience #statistics #machinelearning #confusionmatrix

#confusionmatrix#datascience#false#machinelearning#statistics#threshold

In context learning, let’s dig deeper and let me know...

3,242 views

2022-12-14

In context learning, let’s dig deeper and let me know what I should do next. #machinelearning #datascience #largelanguagemodels #incontextl...

#context#datascience#incontextlearning#largelanguagemodels#learning#machinelearning

Save this! - Deep Dive on Time Series with Kolmogorov...

3,229 views

2024-11-03

Save this! - Deep Dive on Time Series with Kolmogorov-Arnold Networks (KAN) - 1. Toy Dataset Notebook: https://colab.research.google.com/dr...

#arnold#going#kan#kind#kolmogorov#les#like#mod#model#networks#nous#pour#que#see#series#set#time#vous

Highlighting BerTopic #datascience #statistics #nlp #...

3,229 views

2022-05-25

Highlighting BerTopic #datascience #statistics #nlp #huggingface #codetok

#datascience#huggingface#look#nlp#statistics#topics

Reply to @nolankeller23

3,229 views

2022-03-09

Reply to @nolankeller23

#building#data#different#lots#pipelines#scientists

The Audio Spectrogram Transformer model was proposed ...

3,215 views

2024-12-19

The Audio Spectrogram Transformer model was proposed in AST: Audio Spectrogram Transformer by Yuan Gong, Yu-An Chung, James Glass. The Audio...

#audio#model#onthisday#spectrogram#transformer#use#vision

I like notebooks for data science, but others differ....

3,203 views

2022-07-26

I like notebooks for data science, but others differ. #datascience #jupyternotebook #codetok #python

#birken#codetok#datascience#jupyternotebook#like#python

Prompt sensitivity is still a thing. This video cover...

3,200 views

2025-01-11

Prompt sensitivity is still a thing. This video covers how changes in formatting, the persuasion used in prompts, and prompt injection attac...

#formatting#injection#particular#prompt#prompts#still

Sharing my favorite data science news and resources, ...

3,189 views

2022-12-16

Sharing my favorite data science news and resources, find it bit.ly/raj_reads #machinelearning #datascience

#data#datascience#get#machinelearning#news#science

What are your favorite tips for error analysis? #data...

3,179 views

2022-06-17

What are your favorite tips for error analysis? #datascience #statistics #analytics #machinelearning #codetok #mltok

#analytics#codetok#datascience#machinelearning#model#statistics

Then my team builds data pipelines for the next eight...

3,159 views

2022-09-17

Then my team builds data pipelines for the next eight months #datascience #dataengineering #analytics

#analytics#builds#data#dataengineering#datascience#team

I also think of this when walking through security or...

3,151 views

2025-02-06

I also think of this when walking through security or thinking about cheaters.

#aim#also#security#think#thinking#walking

Kolmogorov-Arnold Networks for Time Series. It's anot...

3,147 views

2024-11-01

Kolmogorov-Arnold Networks for Time Series. It's another way to approximate complex functions and a lot of research is trying it out: Check ...

#approach#arnold#enfoque#kan#khan#que#series#time

I love #rstats, but spend most of my time now in #pyt...

3,145 views

2022-07-21

I love #rstats, but spend most of my time now in #python #datascience #codetok #machinelearning

#codetok#datascience#love#machinelearning#python#rstats

Sentence transformers are awesome. Let's talk about th...

3,138 views

2024-06-13

Sentence transformers are awesome. Let's talk about the differences between Word2vec, Transformers, and Sentence Transformers.

#even#get#like#sentence#transformer#transformers

Ring Attention lets you split up the attention calcul...

3,130 views

2024-04-14

Ring Attention lets you split up the attention calculation across GPUs to allow much longer context lengths. LLMs are using this to scale to...

#around#attention#context#ring#ringattention#transformers

Peak ML #datascience #codetok #huggingface #gradio #h...

3,125 views

2022-08-24

Peak ML #datascience #codetok #huggingface #gradio #huggable #imageclassification

#codetok#datascience#gradio#huggable#huggingface#imageclassification

Deep dive into Devin as an AI software engineer. Chec...

3,117 views

2025-01-18

Deep dive into Devin as an AI software engineer. Check the deep dive by Answer AI: https://www.answer.ai/posts/2025-01-08-devin.html Inspi...

#answer#code#deep#devin#dive#engineer#reviewing#see#software#went

Regularization is a technique to keep your model from...

3,116 views

2023-12-03

Regularization is a technique to keep your model from overfitting. It's widely used in machine learning. #datascience #statistics #regulariz...

#datascience#keep#model#pillars#prompting#regularization#statistics#technique

Your AI agent isn’t dumb. It’s forgetful. Most agents...

3,108 views

2025-12-19

Your AI agent isn’t dumb. It’s forgetful. Most agents redo the same work every run instead of learning from success. This video shows how th...

#agent#agents#agno#continual#dumb#forgetful#gonna#isn#learn#learning#plan#run#time

Synthetic datasets have given me a way to understand ...

3,108 views

2023-01-25

Synthetic datasets have given me a way to understand better how to do feature selection and model explainability. Try it out sometime. #data...

#datascience#datasets#explainability#features#machinelearning#synthetic#syntheticdata

Some data analysis tips: 1. Your data might be unrepr...

3,095 views

2023-08-02

Some data analysis tips: 1. Your data might be unrepresentative 2. Think about what was collected and what wasn't 3. Not all data is useful ...

#analysis#data#dataanalysis#datascience#machinelearning#tips

Visualizations for showing variation in the data or u...

3,095 views

2022-12-05

Visualizations for showing variation in the data or uncertainty. Based on Unfair Comparisons by Eli Holder. #datascience #machinelearning #s...

#datascience#datavisualization#machinelearning#people#statistics#variation

Using LLMs for tools is an emerging trend this year. ...

3,093 views

2024-04-05

Using LLMs for tools is an emerging trend this year. Cohere has focused on it with its new Command-R+ model that focuses on enterprise use ...

#cohere#model#multi#tool#tools#use

Pair programming is some of my favorite times as a da...

3,084 views

2023-03-19

Pair programming is some of my favorite times as a data scientist. I am starting to use ChatGPT to fill that role lately. It's useful for me....

#chatgpt#codex#datascience#machinelearning#pairprogramming#way

Think you know how Reinforcement Learning for LLMs re...

3,066 views

2025-11-02

Think you know how Reinforcement Learning for LLMs really works? The secret isn't “just do more training” — it's about where your data comes...

#distillation#every#explained#feedback#learning#llms#model#policy#reinforcement#simply

Overdue for sports analytics #datascience #analytics ...

3,042 views

2022-08-12

Overdue for sports analytics #datascience #analytics #codetok #sportsanalytics #machinelearning

#analytics#bad#codetok#datascience#machinelearning#sportsanalytics

Ensemble learning with majority voting can improve de...

3,041 views

2025-03-14

Ensemble learning with majority voting can improve decision accuracy, as demonstrated when three 70%-accurate models combined outperform a s...

#accuracy#data#learning#majority#model#models#science#three#voting
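
The three-model claim in the description above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the three models make errors independently (an assumption added here; real models are usually correlated, which shrinks the gain). The function name `majority_vote_accuracy` is hypothetical.

```python
from itertools import product


def majority_vote_accuracy(p: float, n_models: int = 3) -> float:
    """Probability that a strict majority of n independent models,
    each correct with probability p, gives the right answer."""
    total = 0.0
    # Enumerate every correct/incorrect pattern across the models.
    for outcome in product([True, False], repeat=n_models):
        if sum(outcome) > n_models / 2:  # strict majority is correct
            prob = 1.0
            for is_correct in outcome:
                prob *= p if is_correct else (1.0 - p)
            total += prob
    return total


ensemble_accuracy = majority_vote_accuracy(0.7)
print(round(ensemble_accuracy, 3))  # 0.784
```

Under the independence assumption, three 70%-accurate voters reach about 78.4%, because the vote is right whenever at least two of the three are right.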

Contrastive learning is common for folks working in N...

3,039 views

2022-11-03

Contrastive learning is common for folks working in NLP and images. This was new to me, so wanted to share the intuition a bit more widely. ...

#different#embedding#intuition#learning#loss#space

Feature engineering with categorical variables and ge...

3,033 views

2024-09-18

Feature engineering with categorical variables and gender categories from ChatGPT.

#categorical#data#gender#good#scientist#value

Leakage is omnipresent #datascience #analytics #codet...

3,032 views

2022-08-18

Leakage is omnipresent #datascience #analytics #codetok #targetleakage

#analytics#codetok#datascience#leakage#omnipresent#targetleakage

I was so focused! Data is hard. #datascience #dataa...

3,030 views

2022-06-07

I was so focused! Data is hard. #datascience #dataanalysis #statistics #codetok #mltok #machinelearning

#codetok#dataanalysis#datascience#machinelearning#mltok#statistics

In his talk, Denny Zhou outlined four strategies for ...

3,026 views

2025-08-11

In his talk, Denny Zhou outlined four strategies for improving LLM reasoning without increasing model size: eliciting reasoning via intermed...

#answer#ask#denny#fine#key#lessons#llm#model#paths#reasoning#retrieval#talk#zhou

vLLM: A widely used inference and serving engine for ...

3,023 views

2024-08-17

vLLM: A widely used inference and serving engine for LLMs

#engine#inference#one#platforms#serving#used#vllm#widely

Google Colab, Kaggle, and LangChain are all great way...

3,017 views

2023-01-21

Google Colab, Kaggle, and LangChain are all great ways to start learning this weekend! #datascience #machinelearning #kaggle #googlecolab #l...

#datascience#google#googlecolab#kaggle#langchain#machinelearning

Datasaurus. Remember to visualize your data.

3,010 views

2025-01-14

Datasaurus. Remember to visualize your data.

#big#data#datasaurus#deal#remember#visualize

Watch out for leakage, it happens even to the best. ...

2,983 views

2022-07-10

Watch out for leakage, it happens even to the best. #datascience #statistics #dataleakage #targetleakage #machinelearning

#christmas#dataleakage#datascience#machinelearning#statistics#targetleakage

Training an image classifier using 🤗 transformers #d...

2,969 views

2022-08-04

Training an image classifier using 🤗 transformers #datascience #analytics #codetok #deeplearning #huggingface Longer video at other site us...

#analytics#codetok#datascience#deeplearning#huggingface#model

Good times, what was your first ML model? #titanic #...

2,958 views

2022-06-15

Good times, what was your first ML model? #titanic #datascience #statistics #codetok #machinelearning #rstats #python

#codetok#datascience#machinelearning#rstats#statistics#titanic

Rerankers improve search by examining whether documen...

2,956 views

2025-03-12

Rerankers improve search by examining whether documents truly answer your specific question, unlike retrievers that only match similar words...

#documents#explaining#including#instruction#question#rag#ranker#reranker#rerankers#results#retrieval#search#similar

As language models expand into fuzzier domains like m...

2,944 views

2025-08-03

As language models expand into fuzzier domains like medical advice and policy summarization, traditional training signals break down. This v...

#domains#fuzzy#learning#like#models#reason#reinforcement#reward#rubric#rubrics#tasks#teaching#training

Reminder to visualize your data with one of my favori...

2,941 views

2024-11-02

Reminder to visualize your data with one of my favorites, Anscombe's quartet

#analyze#anscombe#data#datasets#different#favorites#one#reminder#visualize

I still haven't tried Copilot. Have you? #datascience...

2,940 views

2022-07-27

I still haven't tried Copilot. Have you? #datascience #codetok #codex #copilot #python

#code#codetok#codex#copilot#datascience#python

It’s taken a while to accept this. #python #programmi...

2,930 views

2022-03-28

It’s taken a while to accept this. #python #programming #datascience

#accept#datascience#point#programming#python#taken

Reply to @pal_protty negative reinforcement and #reg...

2,926 views

2022-02-27

Reply to @pal_protty negative reinforcement and #regressiontothemean . Link to the Nylon calculus #basketball article in comments. #datasc...

#basketball#better#datascience#regressiontothemean#team#variability

This video explains temperature, one of the most cruc...

2,911 views

2025-03-28

This video explains temperature, one of the most crucial settings when working with GPT models. Temperature controls the randomness of the m...

#creative#distribution#one#results#save#temperature#thinking#time#want#won
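
As a rough illustration of what the temperature setting does, the sketch below divides a set of scores by the temperature before applying softmax. It is a generic textbook formulation, not code from the video; the logits are made-up numbers and `softmax_with_temperature` is a hypothetical helper name.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # illustrative next-token scores (assumed)
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

A low temperature concentrates probability on the top-scoring option, while a high temperature flattens the distribution so sampling becomes more varied.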

Ensembling is a key method in machine learning. This ...

2,905 views

2023-03-15

Ensembling is a key method in machine learning. This video introduces ensembling through majority voting. #datascience #machinelearning #ensem...

#accuracy#ensembling#majority#models#three#voting

Most enterprise GenAI pilots fail to deliver measurab...

2,903 views

2025-08-22

Most enterprise GenAI pilots fail to deliver measurable ROI because of structural and organizational gaps rather than model quality. The rep...

#deliver#enterprise#fail#gen#genai#gonna#people#pilots#projects#roi#tool#work

#datascience #analytics #statistics #wald #survivorsh...

2,897 views

2022-03-10

#datascience #analytics #statistics #wald #survivorshipbias

#analytics#datascience#planes#statistics#survivorshipbias#wald

Amazing how this stuff keeps getting better #datascie...

2,890 views

2022-08-14

Amazing how this stuff keeps getting better #datascience #codetok #machinelearning #codex

#better#codetok#codex#datascience#machinelearning#math

95% of GenAI pilots fail, not because of the model bu...

2,874 views

2025-11-16

95% of GenAI pilots fail, not because of the model but because of the approach. The simple playbook to actually scale is Usability, Utility,...

#enterprise#fail#gen#genai#model#need#one#pilots#successful#tools#transformation#value

#duet with @the.rachel.woods #rachelwoods Go hustle b...

2,858 views

2023-04-14

#duet with @the.rachel.woods #rachelwoods Go hustle but don’t take it personally when they don’t respond. Instead wait your time. And then ...

#don#duet#hustle#rachel#rachelwoods#woods

GEPA uses reflection to optimize prompts instead of r...

2,854 views

2025-10-13

GEPA uses reflection to optimize prompts instead of retraining. In Intrinsic Labs’ OCR study, it analyzed its own extraction errors—like mis...

#gepa#intrinsic#labs#like#ocr#prompts

LLMs can streamline existing AI/ML operations by repl...

2,852 views

2025-02-22

LLMs can streamline existing AI/ML operations by replacing specialized models (e.g., BERT for classification, SpaCy for NER, T5 for summariz...

#around#comes#data#human#like#limits#llms#models#pipelines#replacing#science#traditional

Text to video models including text2video. The models...

2,843 views

2023-03-26

Text to video models including text2video. The models are getting better and there is now a place over at the Hugging Face Hub to find them....

#datascience#machinelearning#models#text#text2video#video

Replying to @urdar635 watermarking output from AI mod...

2,840 views

2022-12-09

Replying to @urdar635 watermarking output from AI models is something that is being considered. It’s done by adding some “signal” to the out...

#chatgpt#datascience#machinelearning#openai#output#watermarking

Feature Selection Deep Dive, Notebook at: https://bit...

2,836 views

2024-10-13

Feature Selection Deep Dive, Notebook at: https://bit.ly/raj_fs

#feature#features#going#see#selection#use

Let’s talk about common challenges in human annotatio...

2,815 views

2025-05-05

Let’s talk about common challenges in human annotation for AI training data, particularly around ambiguous label definitions and inconsisten...

#agreement#annotation#annotator#around#best#definitions#don#label#practices

Reply to @grahamkechnie #wastewater #cornovirus #anal...

2,803 views

2022-02-03

Reply to @grahamkechnie #wastewater #cornovirus #analysis #cryptic

#analysis#cornovirus#cryptic#grahamkechnie#reply#wastewater

Big data bowl submissions are going in and lots of gr...

2,797 views

2023-01-12

Big data bowl submissions are going in and lots of great sports analytic work. This one is on strain for evaluating pass rushers. #datascien...

#bigdatabowl#datascience#defensive#nfl#statistics#strain

How to Build a State-of-the-Art Retrieval System? Les...

2,791 views

2025-01-04

How to Build a State-of-the-Art Retrieval System? Lessons from Kaggle's Top Solution 👀 Let's look at the winning solution from Raja Biswas ...

#agent#billion#data#explained#going#holistic#laws#leaderboard#model#models#quickly#retrievers#scaling#synthetic#use#used

#onthisday a classic debate: notebooks versus scripts

2,775 views

2023-07-26

#onthisday a classic debate: notebooks versus scripts

#classic#debate#notebooks#onthisday#scripts#versus

Exclusive interview with OpenAI asking all the questi...

2,773 views

2023-05-19

Exclusive interview with OpenAI asking all the questions you wished to ask. Including: What's the deal with the name? How do you feel about...

#asking#data#datascience#exclusive#interview#machinelearning#may#open#openai#skit#source

This skit highlights the gap between traditional mode...

2,764 views

2025-07-27

This skit highlights the gap between traditional model evaluation metrics (like precision) and real-world deployment concerns. The developer...

#also#aren#business#cost#enough#evaluation#gap#going#highlights#metrics#model#need#precision#skit

How are you using similarity search? Looking at Spoti...

2,748 views

2024-06-26

How are you using similarity search? Looking at Spotify's Annoy for nearest neighbor search for embeddings. #spotify #annoy #embeddings #ra...

#annoy#data#embeddings#numbers#scientists#spotify

NASA uses generative AI for manufacturing parts for s...

2,744 views

2024-10-28

NASA uses generative AI for manufacturing parts for space. It’s a great use of generative technology and you can start seeing how it will ch...

#61#design#generative#generativeai#intro#manufacturing#march#nasa#trends#use

This video illustrates the limitations of long-contex...

2,736 views

2025-04-13

This video illustrates the limitations of long-context LLMs across real benchmarks. While models like GPT-4o perform well on retrieval tasks...

#check#context#every#gpt#llm#long#models#needle#new#things#three#tokens#top

When people ask, "What's the best AI algorithm?" the ...

2,727 views

2025-06-25

When people ask, "What's the best AI algorithm?" the real answer is... it depends. I explain this using a search-and-rescue analogy (finding...

#algorithm#best#kaggle#linear#one#regression#search

Riveter üí™ is a Python package that measures social...

2,726 views

2023-08-08

Riveter üí™ is a Python package that measures social dynamics between personas mentioned in a collection of texts. Check it out at: https:/...

#maartensap#measures#package#python#riveter#social

Reply to @garlic_gworl #fakeai #datascience #mechical...

2,672 views

2022-01-23

Reply to @garlic_gworl #fakeai #datascience #mechicalturk #aiethics #labor #ai

#actually#aiethics#datascience#fakeai#labor#mechicalturk

Deep Dive (1 hour) on Evaluation for Generative AI

2,662 views

2025-05-18

Deep Dive (1 hour) on Evaluation for Generative AI

#airbnb#behind#dirty#evaluation#example#generative#gonna#google#greatest#kind#learning#like#models#secret#things#use#using#veo

Highlight great research from #anthropic studying the...

2,659 views

2022-12-21

Highlight great research from #anthropic studying the behavior of large language models. #machinelearning #datascience #largelanguagemodels

#anthropic#datascience#language#large#machinelearning#models

Curse of dimensionality reminds us to think carefully ...

2,657 views

2023-02-11

Curse of dimensionality reminds us to think carefully about feature selection. More isn’t always better. Use a feature selection curve. #da...

#curseofdimensionality#datascience#feature#features#featureselection#machinelearning#model#selection

Compute keeps getting cheaper. GPUs keep getting fast...

2,653 views

2025-12-17

Compute keeps getting cheaper. GPUs keep getting faster. So why do bigger models feel less efficient? This video breaks down a real technica...

#barbarians#beating#better#cheaper#compute#faster#gate#getting#gpus#humans#keep#keeps#models#research#systems

In this comprehensive talk (adapted from my presentat...

2,649 views

2025-11-02

In this comprehensive talk (adapted from my presentation at ODSC), I provide a practical, hands-on framework for evaluating your GenAI and L...

#anomaly#dataset#detection#different#going#kaputt#language#like#model#models#one#using#vision#visual#want

Recent work in Text-to-SQL shows that once you get pa...

2,647 views

2024-08-31

Recent work in Text-to-SQL shows that once you get past demo datasets, the performance drops. By incorporating human expertise, you can buil...

#applications#different#evaluating#experts#generative#going#guide#kind#like#performance#practical#see#snowflake#sql#text#updated#want#work

Great tips to save money, processing time, and improv...

2,632 views

2024-06-08

Great tips to save money, processing time, and improve speed in the FrugalGPT paper. The video covers three types of strategies to reduce th...

#frugalgpt#great#llm#model#tips#use

Get shiny to run on hugging face spaces (or even some...

2,625 views

2023-01-15

Get shiny to run on hugging face spaces (or even some other web app) #huggingface #posit #rstudio #shiny #datascience

#datascience#get#huggingface#posit#rstudio#shiny

Yikes: Gen AI is not that easy. Going over some recen...

2,616 views

2024-06-30

Yikes: Gen AI is not that easy. Going over some recent stories about difficulty getting Generative AI to production. On vacation this week,...

#data#gen#generative#going#need#new

Models and datasets have specific definitions. Models...

2,613 views

2024-05-15

Models and datasets have specific definitions. Models consist of at least two licenses nowadays; this has been an issue for LLaMA where the ...

#content#copyright#data#laoin#llama#sets

Find all 5 metrics mentioned? Unit test approach is a...

2,612 views

2024-06-25

Find all 5 metrics mentioned? Unit test approach is a great way of thinking about evaluations for generative AI. Check out my yt for a longe...

#evaluation#functionalcorrectness#huggingface#largelanguagemodels#unit#unittests

Many techniques for text similarity including lexical...

2,585 views

2024-09-28

There are many techniques for text similarity, including lexical, semantic, and hash strategies. Here is a short list of some popular methods: 1. FuzzyW...

#fuzzywuzzy#github#jellyfish#rapidfuzz#scale#something

Need a deeper dive on retrieval: I have a video on YT...

2,584 views

2025-01-03

Need a deeper dive on retrieval: I have a video on YT I will post here this weekend. Or go check it out if you want it sooner: https://youtu...

#data#eedi#kaggle#models#one#retrievers#used

What happened at AWS Reinvent? I got to catch up with...

2,566 views

2024-12-09

What happened at AWS Reinvent? I got to catch up with a ton of great people and companies. Next time, I will take more pictures. I really re...

#aws#catch#experience#face#got#reinvent#take#ton

Feature engineering

2,546 views

2025-03-02

Feature engineering

#automated#data#engineering#feature#hands#might#model#notebook#openfe#raw#using#walkthrough

Explanations for transformers gently #datascience #co...

2,543 views

2022-08-18

Explanations for transformers gently #datascience #codetok #deeplearning

#codetok#data#datascience#deeplearning#explanations#features

Isolation Forests are an anomaly detection algorithm ...

2,527 views

2024-08-03

Isolation Forests are an anomaly detection algorithm that builds trees to partition data, isolating "lonely" points or outliers with fewer p...

#anomalydetection#data#detection#isolation#lonely#madeinhotelroom

The Keras versus Pytorch benchmarking drama. This isn...

2,527 views

2024-04-06

The Keras versus Pytorch benchmarking drama. This isn't about picking sides. I want to point out how difficult it is to do these sorts of b...

#benchmarks#compare#going#keras#python#pytorch

Getting prediction intervals with conformal predictio...

2,526 views

2022-09-21

Getting prediction intervals with conformal prediction. This is a very simple intro; it can do much more. #datascience #statistics #predicti...

#data#datascience#let#prediction#predictioninterval#statistics

ComfyUI

2,524 views

2025-03-07

ComfyUI

#amazing#comfy#comfyui#easy#images#let#start

Synthetic data improves model performance only when i...

2,520 views

2025-10-08

Synthetic data improves model performance only when it expands coverage rather than replicating existing distributions. Using LLMs to genera...

#data#evaluation#examples#expands#going#improves#llm#model#performance#synthetic#think#training#use

RAG (Retrieval Augmented Generation) addresses fundame...

2,514 views

2025-01-26

RAG (Retrieval Augmented Generation) addresses fundamental limitations of base LLMs like ChatGPT, which can generate incorrect technical info...

#answer#beam#better#context#documentation#explained#getting#grounded#like#oldie#pretty#rag#relevant#search#still#text

Intro for AI Literacy #datascience #machinelearning #...

2,496 views

2022-01-14

Intro for AI Literacy #datascience #machinelearning #ai #programming #literacy #alleninstitute

#ai#alleninstitute#datascience#literacy#machinelearning#programming

5 things to look for when a new model is announced Li...

2,492 views

2025-04-19

5 things to look for when a new model is announced: License, Size of the Model, Benchmarks, Training Data/Details, Fine-Tuning & Tech Specs

#benchmarks#billion#context#data#fine#including#license#llm#long#model#new#nolima#training

Got some time this weekend? Go build a web demo. #dat...

2,491 views

2022-11-25

Got some time this weekend? Go build a web demo. #datascience #statistics #shinyr #rstats #python #gradio #streamlit

#datascience#fuck#powerpoint#rstats#shinyr#statistics

Open source can be a lot of work #opensource #techtok...

2,477 views

2022-04-15

Open source can be a lot of work #opensource #techtok #programming #python #github

#another#github#opensource#programming#python#techtok

It’s not easy to answer questions. Techniques like mu...

2,467 views

2025-01-10

It’s not easy to answer questions. We use techniques like multi-retrieval and multi-hop every day without thinking about it. However, with A...

#1#answer#eedi#hop#kaggle#like#multi#questions#rag#retrieval#secrets#solution#use

Statistics leverages randomness across machine learni...

2,459 views

2025-01-13

Statistics leverages randomness across machine learning applications, including random forests, dropout in neural networks, and hyperparamet...

#data#evolving#learning#like#random#randomness#reinforcement#rler#rubrics#sampling#statistics#tulu#using

Don't overspend getting into data science. This episo...

2,450 views

2024-08-20

Don't overspend getting into data science. This episode is dedicated to the snap-on and ikon controversy. Reminders: Basic macbook, Google c...

#data#don#get#gpus#macbook#science

AI agents used to shut down mid-task or hallucinate v...

2,446 views

2025-07-06

AI agents used to shut down mid-task or hallucinate vending empires.
Now? They're beating humans at long-horizon business simulations. From ...

#agentcompany#agents#benchmark#claude#learning#like#task#vending#work

This video explains why FP16 (16-bit floating point) ...

2,445 views

2025-05-16

This video explains why FP16 (16-bit floating point) isn't always suitable for training neural networks due to instability caused by limited...

#bit#brain#floating#learning#models#point#potential#spark#training#transfer#unlocked

Explanation Approaches for Transformers

2,439 views

2022-08-12

Explanation Approaches for Transformers

#approaches#audiomodels#datascience#explanation#huggingface#machinelearning#speechmodels#speecht5#transformers

What a data scientist does #datascience #analytics #c...

2,432 views

2022-09-04

What a data scientist does #datascience #analytics #codetok #python

#analytics#codetok#data#datascience#loss#python#regression#scientist#statistics

Flux unifies text-to-image and image editing in a sin...

2,431 views

2025-09-29

Flux unifies text-to-image and image editing in a single model. By working in latent space, using flow matching, and applying adversarial di...

#diffusion#editing#flex#flux#generation#generative#image#model#text#unifies

Visualizing decision trees with dtreeviz. Check out t...

2,421 views

2024-12-28

Visualizing decision trees with dtreeviz. Check out their GitHub and it’s pip install dtreeviz. If you see a cooler way to do this, let me k...

#check#decision#dtreeviz#see#tree#trees

OpenFE is an automated Feature Engineering package. I...

2,419 views

2024-09-08

OpenFE is an automated Feature Engineering package. I found out about this through Kaggle, check out the notebook for all the links: Noteboo...

#dataset#feature#features#notebook#openfe#well

LLM-as-a-judge isn’t broken. Our mental model is. Ins...

2,416 views

2025-12-28

LLM-as-a-judge isn’t broken. Our mental model is. Instead of fixing the judge with prompts, this video shows how calibration can turn cheap,...

#broken#calibrating#cheap#fixing#gold#instead#isn#judge#judges#llm#mental#model#using

Video models like Veo-3 demonstrate zero-shot reasoni...

2,416 views

2025-10-04

Video models like Veo-3 demonstrate zero-shot reasoning across four emergent abilities: Perception (understanding visual scenes), Modeling (...

#learners#make#models#reasoning#shot#veo#video#zero

A debate whether AI evals are worth the effort. The H...

2,416 views

2025-09-06

A debate whether AI evals are worth the effort. The Hacker says benchmarks don’t reflect reality, eval sets are brittle, and vibes or natura...

#debate#dettmers#don#efficiency#effort#eval#evals#scaling#sets#signals#versus#vibes#whether#worth

A simple explanation of what AI is. The video touches...

2,413 views

2024-06-26

A simple explanation of what AI is. The video touches upon the impact of AI, how AI works with a practical example, and some of the reasons ...

#ai#aiexplained#datascience#machinelearning#onthisday#years

Having some fun connecting a spreadsheet to a ML mode...

2,409 views

2022-11-04

Having some fun connecting a spreadsheet to a ML model. It wasn’t too hard and it’s pretty cool to have it working this way. #datascience #...

#code#datascience#huggingface#machinelearning#model#spreadsheet

#onthisday

2,405 views

2024-12-17

#onthisday

#data#got#onthisday#really#science#somebody

Earthquake visualization from lazarusA #datascience #...

2,400 views

2022-02-01

Earthquake visualization from lazarusA #datascience #datavisualization #visualization #julia #python #earthquakes

#datascience#datavisualization#earthquakes#julia#python#visualization

Check out my earlier videos on Block World. The lates...

2,387 views

2024-09-24

Check out my earlier videos on Block World. The latest paper is: LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on...

#benchmark#blockworld#got#like#models#mysteryworld

Fine Tuning Sentence Transformers MedEmbed: Fine-Tune...

2,379 views

2024-10-25

Fine Tuning Sentence Transformers MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR: https://huggingface.co/blog/abhinand/mede...

#data#fine#medical#model#models#tuning

I have no attention span. How will I learn from these...

2,375 views

2022-09-26

I have no attention span. How will I learn from these videos? #datascience #codetok #python

#attention#codetok#datascience#learn#python#span

Fixing Imbalanced Data in Machine Learning

2,367 views

2024-10-17

Fixing Imbalanced Data in Machine Learning

#basic#data#datasets#fixing#handling#imbalanced#learning#machine#smote#techniques

We keep building AI copilots that look great in demos...

2,366 views

2025-12-26

We keep building AI copilots that look great in demos and fail in the real world. This skit shows a common mistake: designing for chat, pers...

#building#chat#copilot#copilots#demos#fail#failed#get#great#keep#look#need#want

Shallow learning with tensorflow playground #datascie...

2,364 views

2022-02-17

Shallow learning with tensorflow playground #datascience #tensorflow #python #machinelearning #deeplearning

#datascience#deeplearning#like#machinelearning#python#tensorflow

Google’s sparrow is the rumored competitor to OpenA...

2,361 views

2023-01-21

Google’s sparrow is the rumored competitor to OpenAI ChatGPT. Check out the paper to see lots of examples of it chatting. It looks really ...

#chatgpt#datascience#google#googlesparrow#machinelearning#openai#sparrow

Go explore if you are new #datascience #techtok #anal...

2,359 views

2022-04-14

Go explore if you are new #datascience #techtok #analytics

#analytics#data#datascience#scientist#start#techtok

Any old school SAS users out there? #datascience #sta...

2,353 views

2022-04-21

Any old school SAS users out there? #datascience #statistics #sas

#datascience#old#sas#school#statistics#users

AI Engineering tips. I have witnessed many of these.

2,352 views

2025-03-15

AI Engineering tips. I have witnessed many of these.

#accurate#automated#engineering#feature#fetch#github#hours#model#next#openfe#spend#tips#tools

Quick reminder that lots of academic work doesn't last...

2,347 views

2024-10-29

Quick reminder that lots of academic work doesn't last very long.

#academic#doesnt#lots#quick#reminder#work

Baseline models are important when comparing differen...

2,342 views

2024-04-07

Baseline models are important when comparing different models. Benchmark datasets are handy for seeing how a model does for a specific sc...

#baseline#baselinemodel#benchmark#datascience#machinelearning#model

OpenAI’s GPT-5 Codex bakes in adaptive compute — trim...

2,326 views

2025-09-16

OpenAI’s GPT-5 Codex bakes in adaptive compute — trimming steps for simple edits and expanding for complex coding, all inside one model. But...

#adaptive#codex#compute#gpt#model#openai#router#semantic#vllm

When creating simulated data, you have complete contr...

2,324 views

2025-01-20

When creating simulated data, you have complete control over which elements represent meaningful patterns and which represent random variati...

#complete#creating#data#features#know#makes#model#represent#set#simulated

I like to stay practical; there is plenty to get excited ab...

2,321 views

2022-08-10

I like to stay practical; there is plenty to get excited about and get worried about without AGI. AGI is artificial general intelligence and the id...

#agi#artificialintelligence#codetok#datascience#get#like

Claude is amazing, but still plenty of room for AI to...

2,320 views

2024-07-04

Claude is amazing, but still plenty of room for AI to improve. Let's dig into two challenging benchmarks for LLMs, Connections and MUSR. The...

#challenging#connections#llms#musr#reasoning#still

A year later and not much has changed.

2,315 views

2024-11-19

A year later and not much has changed.

#alternative#get#going#gonna#know#open#weekend

Applying a classic methodology of ablation when worki...

2,310 views

2022-11-12

Applying a classic methodology of ablation when working with stable diffusion prompts. Ablation is very common in many techniques to underst...

#ablationsurgery#datascience#machinelearning#removing#stablediffusion#statistics

In this satirical video, a customer requests a modifi...

2,288 views

2025-04-10

In this satirical video, a customer requests a modified ChatGPT aligned with their political views, and the vendor explains various technica...

#human#lamp#like#llms#math#model#options#reliability#skills#time#told#using#wait

Applying a classic methodology of ablation when worki...

2,285 views

2022-11-12

Applying a classic methodology of ablation when working with stable diffusion prompts. Ablation is very common in many techniques to underst...

#ablationsurgery#datascience#machinelearning#removing#stablediffusion#statistics

Predicting social outcomes is not doable by #ai #ethics ...

2,280 views

2022-05-11

Predicting social outcomes is not doable by #ai #ethics #bias #datascience #statistics #snakeoil

#ai#bias#datascience#ethics#snakeoil#statistics

Recent studies have shown that AI can be more persuas...

2,279 views

2024-11-23

Recent studies have shown that AI can be more persuasive than humans. #onthisday

#coordinates#diplomacy#figured#humans#onthisday#play#trained

A couple of techniques we use to compress models. Thi...

2,270 views

2025-02-01

A couple of techniques we use to compress models. This saves GPU memory and can reduce the amount of compute needed. Model distillation comp...

#compress#couple#doesn#memory#model#need#parts#smaller#techniques#use

Comparing different data formats—Tabular, Unstructure...

2,261 views

2025-03-16

Comparing different data formats—Tabular, Unstructured, and JSON—as characters debating their roles in the AI revolution. Tabular data feels...

#data#formats#json#need#revolution#tabular#unstructured

Evaluation is critical for LLMs and there is an entir...

2,251 views

2024-11-06

Evaluation is critical for LLMs and there is an entirely new generation of evaluation applications coming. Here I show off Braintrust, which...

#application#braintrust#critical#data#entirely#evaluation#llms#model#need#new#performance#see

RAG doesn't work out of the box! There are many possi...

2,248 views

2024-06-01

RAG doesn't work out of the box! There are many possible issues with the answers. In this paper, researching RAG in the legal context, the a...

#hallucination#hallucinations#legal#llms#paragraph#rag

The video pokes fun at the hype and fear surrounding ...

2,247 views

2025-04-01

The video pokes fun at the hype and fear surrounding GPT-4, AI job loss, and tech sensationalism. It reminds data professionals that most re...

#analytics#data#deep#explainability#gpt#hype#interpretability#know#learning#machine#model#models#still

Model Risk Management (MRM), important but can be fru...

2,244 views

2022-03-15

Model Risk Management (MRM), important but can be frustrating. #datascience #regulatedindustries #explainability #statistics

#datascience#explainability#model#mrm#regulatedindustries#statistics

What happens when humans stop fearing AI—and start le...

2,237 views

2025-06-10

What happens when humans stop fearing AI—and start learning from it?
This video explores how superhuman AI didn’t just beat humans at Go or ...

#better#crossing#data#ethical#fails#human#new#novel#science#social#superhuman#use

Use Cases for Generative AI / LLMs that are in Produc...

2,237 views

2024-07-25

Use Cases for Generative AI / LLMs that are in Production

#cases#esto#generative#las#llms#los#modelos#para#production#que#use

Pandas versus Polars Check out: https://github.com/p...

2,236 views

2024-11-28

Pandas versus Polars Check out: https://github.com/pola-rs/polars Polars vs. pandas: What’s the Difference? https://blog.jetbrains.com/pych...

#comparison#fast#group#pandas#polars#query#quick#single#try#using#versus

Latency is a key factor but there are others when thi...

2,235 views

2023-08-15

Latency is a key factor but there are others when thinking about deploying large language models. Let's discuss tradeoffs between latency th...

#accuracy#explainability#factor#interpretability#key#latency#learning#limits#machine#others#response#versus

Target leakage is a very common problem, and everyone...

2,231 views

2025-04-12

Target leakage is a very common problem, and everyone should understand it. Even the smartest people and best teams have issues with target ...

#data#dataset#leakage#model#set#target#training#want

Fundamentals folks. A great example is the paper on p...

2,224 views

2025-05-29

Fundamentals folks. A great example is the paper on police misconduct. It highlights a lot of great data science practices (more than I coul...

#analyst#battle#data#learning#machine#misconduct#model#police#scientist

Profit Curve, See earlier parts on Classification Mar...

2,220 views

2022-03-15

Profit Curve, See earlier parts on Classification Metrics here: @rajistics @rajistics #datascience #statistics #confusionmatrix

#confusionmatrix#datascience#false#profit#see#statistics

Defending prompting for large language models. I will...

2,193 views

2025-05-28

Defending prompting for large language models. I will post links in the morning over at yt and reddit.

#defense#language#like#models#prompting#prompts#real#writing

Quick intro, let me know if a deeper dive is useful. ...

2,186 views

2022-07-19

Quick intro, let me know if a deeper dive is useful. #translation #meta #datascience #machinelearning #huggingface

#datascience#huggingface#machinelearning#meta#quick#translation

YouChat. Looks impressive, I will try it out this wee...

2,168 views

2022-12-24

YouChat. Looks impressive, I will try it out this weekend and let you know.

#impressive#let#looks#try#weekend#youchat

Anthropic is starting to preview their model and peop...

2,166 views

2023-01-07

Anthropic is starting to preview their model and people are comparing it to ChatGPT. Thanks to Riley Goodside for sharing screenshots. It lo...

#anthropic#chatgpt#claude#datascience#largelanguagemodels#machinelearning

Llama 3.1

2,161 views

2024-07-23

Llama 3.1

#going#llama#meta#million#models#open#tomorrow

Data science is a pretty awesome job. Much better th...

2,159 views

2022-10-07

Data science is a pretty awesome job. Much better than my past jobs of working the IT helpdesk or painting rocks. #datascience #analytics #...

#analytics#awesome#data#datascience#science#statistics

Automated Feature Engineering has lots of great tools...

2,151 views

2024-09-02

Automated Feature Engineering has lots of great tools. But remember, automation isn't a full substitute for human expertise and subject matt...

#approach#automated#engineering#feature#openfe#paper

SQL just doesn’t go away and is hipper than ever. #da...

2,151 views

2022-10-18

SQL just doesn’t go away and is hipper than ever. #datascience #dataengineering

#back#coming#dataengineering#datascience#doesn#sql

Hugging Face’s INTIMA benchmark tests how AI handles ...

2,142 views

2025-08-26

Hugging Face’s INTIMA benchmark tests how AI handles emotional boundaries—and the results are worrying. Across 368 prompts, major models oft...

#across#benchmark#benchmarking#blog#companions#don#face#hugging#intima#models#real

Common crawl dataset.

2,140 views

2025-06-07

Common crawl dataset.

#common#crawl#data#dataset#july#mistakes#pages#reduce#set#version

Active Learning prioritizes labeling the most informa...

2,130 views

2025-05-18

Active Learning prioritizes labeling the most informative data points—typically those near the decision boundary—based on model uncertainty....

#active#boundary#context#data#decision#fine#impact#labeling#learning#length#prompting#tuning

A couple of examples of what not to do and what you s...

2,129 views

2022-12-23

A couple of examples of what not to do and what you should do when presenting your data science results to the business. #datascience #stati...

#campaigns#datascience#going#machinelearning#marketing#statistics

Some good lessons in Amazon's efforts to automate war...

2,119 views

2025-05-15

Some good lessons in Amazon's efforts to automate warehouse item stowage. Despite sophisticated hardware, vision systems, and algorithms, th...

#active#automate#data#going#good#humans#items#labeling#learning#less#small#smarter

A fun breakdown of the three split methods in XGBoost...

2,114 views

2024-10-27

A fun breakdown of the three split methods in XGBoost—Exact, Approx, and Histogram—and how each speeds up model training. See which methods ...

#benefits#das#die#exact#histogram#ich#ist#magic#methods#neural#quantization#research#sie#three#training#und#xgboost

Human in the loop is important, but it's not a silver...

2,114 views

2024-09-01

Human in the loop is important, but it's not a silver bullet. #aiethics #tesla #cigna #rajistics Cigna: https://www.healthcaredive.com/news...

#aiethics#cigna#human#loop#tesla#using

AI is starting to outperform humans in surprising pla...

2,110 views

2025-12-13

AI is starting to outperform humans in surprising places: ad creative, systems optimization, even algorithm design. But look closer and a pa...

#evals#even#humans#model#problem#problems#run#still#systems

Feature Selection Methods: A critical part of machine...

2,107 views

2024-09-21

Feature Selection Methods: A critical part of machine learning is identifying the best set of features. Some popular techniques include: Bor...

#feature#features#invite#many#party#selection

VLLM is one of the most widely used serving platforms...

2,087 views

2024-08-17

VLLM is one of the most widely used serving platforms for LLMs. It's also very easy to get started with. Check it out if you are hosting you...

#also#choose#demure#don#mindful#vllm

Cryptic error messages. Cmon. Give it up for actionab...

2,086 views

2022-03-12

Cryptic error messages. Cmon. Give it up for actionable error messages that make coding a downhill sport.

#actionable#cmon#cryptic#error#give#messages

Sparsity is the concept of leveraging zeros in data a...

2,083 views

2025-03-01

Sparsity is the concept of leveraging zeros in data and models for efficiency. Sparse representations enable working with massive datasets b...

#data#experts#models#networks#sparse#zero#zeros

Kubernetes. Good to know.

2,080 views

2025-01-23

Kubernetes. Good to know.

#don#good#know#kubernetes#set#skill#understand

Feature engineering rant

2,080 views

2025-02-28

Feature engineering rant

#deep#engineering#feature#model#people#rant#working

#opensource #explainability #datascience #statistics ...

2,080 views

2022-05-15

#opensource #explainability #datascience #statistics #codetok #programming my intro video: @rajistics

#codetok#datascience#explainability#opensource#programming#statistics

Basic techniques for handling imbalanced datasets: NO...

2,077 views

2024-10-17

Basic techniques for handling imbalanced datasets: NOT SMOTE; Metrics sensitive to imbalance; Algorithms robust; Upsampling for large datasets ...

#class#create#data#examples#fraud#weights

Robotics won’t scale like LLMs until perception, eval...

2,073 views

2025-11-21

Robotics won’t scale like LLMs until perception, evaluation, and embodiment align. We still compress away crucial spatial information, measu...

#compress#doesn#like#llms#lms#robotics#robots#scale#scaling#still#won

Your regular reminder that you should translate the i...

2,073 views

2022-06-04

Your regular reminder that you should translate the impact of your model into something your stakeholders care about. #datascience #statist...

#analytics#codetok#datascience#regular#reminder#statistics

Limits of AI around compute, memory, and interconnect...

2,068 views

2024-09-06

Limits of AI around compute, memory, and interconnection bandwidth. AI and Memory Wall - https://arxiv.org/pdf/2403.14123 Fire-Flyer AI-HPC:...

#bandwidth#compute#cost#hardware#memory#years

Deep dive - save this - Going into LLM reasoning mode...

2,066 views

2025-04-28

Deep dive - save this - Going into LLM reasoning models. This is longer form and my talk track isn’t as tight. But lots of you like seeing t...

#going#kind#like#models#see#thinking#want#well

The Physics of Language Models by Zeyuan Allen-Zhu Ch...

2,062 views

2024-11-24

The Physics of Language Models by Zeyuan Allen-Zhu Check out: ICML 2024 Tutorial: Physics of Language Models - https://youtu.be/yBL7J0kgldU?...

#extracting#knowledge#language#model#models#physics#pre#trump

Text to Chart. It’s easier than ever to build great...

2,059 views

2023-02-15

Text to Chart. It’s easier than ever to build great charts using libraries like plotly or matplotlib. Are other people using ChatGPT for t...

#chatgpt#datascience#like#machinelearning#matplotlib#plotly#python#stackoverflow

Stackoverflow and Github Copilot

2,059 views

2023-07-25

Stackoverflow and Github Copilot

#copilot#github#stackoverflow

Composer will be sharing their new generative AI mode...

2,058 views

2023-02-26

Composer will be sharing their new generative AI models and they look amazing. The key is that they decompose the image, which then provides a l...

#composer#datascience#generativeai#going#machinelearning#new#stablediffusion

Reply to @misho9000 anomaly detection is hard #datas...

2,055 views

2022-05-01

Reply to @misho9000 anomaly detection is hard #datascience #statistics #techtok #anomalydetection #machinelearning

#anomaly#anomalydetection#datascience#machinelearning#statistics#techtok

Microsoft’s chatbot meltdown showed what happens when...

2,048 views

2025-09-24

Microsoft’s chatbot meltdown showed what happens when AI runs without oversight. In this skit, our “naïve vs. expert” duo break down why hum...

#human#humans#like#loop#monitoring#need#people#things#useful

GPT-3 is powerful, but sometimes domain-specific mode...

2,036 views

2023-01-26

GPT-3 is powerful, but sometimes domain-specific models will do better. Pick the right tool for the job. #datascience #machinelearning #hugg...

#chatgpt#datascience#gpt3#huggingface#machinelearning#model#openai

This was from last year, but still holds up.

2,035 views

2024-10-19

This was from last year, but still holds up.

#annotation#data#gpt#labeling#model#refuel

Async for Python (why you want to use asyncio)

2,034 views

2025-11-23

Async for Python (why you want to use asyncio)

#async#asyncio#clip#datascience#huggingface#interrogator#machinelearning#python#stablediffusion#use#want

Beam search improves text generation by considering m...

2,028 views

2025-03-25

Beam search improves text generation by considering multiple candidate sequences instead of just picking the highest probability token at ea...

#beam#better#different#evan#greedy#john#like#paths#peck#search#tips#visualizations

It's not so easy; I used to do research on this crime ...

2,021 views

2024-07-05

It's not so easy; I used to do research on this crime in Chicago

#able#behavior#bigcodebench#chicago#crime#data#evaluating#gen#generative#going#need#new#numbers#statistics#testing#unit

Understanding Entropy and Information Gain in Machine...

2,021 views

2025-04-25

Understanding Entropy and Information Gain in Machine Learning This video explains how entropy measures disorder or uncertainty in machine l...

#credit#entropy#learning#liability#machine#rating

Trees are so nice to work with, but don't forget these step...

2,012 views

2022-07-15

Trees are so nice to work with, but don't forget these steps for other algorithms. #datascience #xgboost #randomforest #statistics #machinelearni...

#codetok#datascience#machinelearning#randomforest#statistics#xgboost

Q 4,5, and 7 from the Allen Institute survey #datasci...

2,005 views

2022-02-23

Q 4,5, and 7 from the Allen Institute survey #datascience #medialiteracy #ai @rajistics @rajistics @rajistics

#actually#ai#allen#based#datascience#medialiteracy

Spend time looking at your data. Data is varied and y...

1,997 views

2025-01-01

Spend time looking at your data. Data is varied and you want to account for patterns like bimodal or zero inflated distributions. As the vid...

#bimodal#damage#data#flood#flooding#model

Zero-shot object detection. #datascience #codetok #hu...

1,989 views

2022-08-09

Zero-shot object detection. #datascience #codetok #huggingface #objectdetection #deeplearning #zeroshotclassification

#codetok#datascience#deeplearning#dose#huggingface#largelanguagemodels#llm#machinelearning#objectdetection#weekly#zeroshotclassification

Myth versus Reality. #sql #datascience #analytics

1,984 views

2022-03-25

Myth versus Reality. #sql #datascience #analytics

#analytics#datascience#don#know#myth#sql

Stable diffusion 2.0 just dropped and a lot of unhapp...

1,983 views

2022-11-25

Stable Diffusion 2.0 just dropped and there are a lot of unhappy people. Who knew giving away software could create so much angst? #datascience #stab...

#datascience#diffusion#dropped#lot#stable#stablediffusion

DSBench: How Far are Data Science Agents Becoming Dat...

1,979 views

2024-09-23

DSBench: How Far are Data Science Agents Becoming Data Science Experts? A challenging benchmark to evaluate LLM systems on real-world data s...

#data#gpt#modeling#new#science#tasks

Reply to @mrjohnlueders #scrum #datascience #agile #s...

1,972 views

2022-02-02

Reply to @mrjohnlueders #scrum #datascience #agile #softwareengineering #analytics

#agile#analytics#datascience#problem#scrum#softwareengineering

GDPval is OpenAI’s new benchmark that tests AI on rea...

1,967 views

2025-09-26

GDPval is OpenAI’s new benchmark that tests AI on real professional tasks from industries that drive GDP. Models like GPT-5 are graded by hu...

#benchmark#benchmarking#exploring#gdp#gdpval#human#models#new#openai#real#tasks

This didn't happen to me recently :) To learn more: A...

1,965 views

2024-12-12

This didn't happen to me recently :) To learn more: An Empirical Analysis of the Python Package Index (PyPI) - https://arxiv.org/pdf/1907.11...

#malicious#name#package#pypi#python#reserve#squatting#super
Video

Feature engineering and data preprocessing are an imp...

1,959 views

2023-02-27

Feature engineering and data preprocessing are an important part of the machine learning process. #datascience #machinelearning #featureengi...

#data#datascience#engineering#feature#featureengineering#machinelearning#model
Video

Replika and the growth of these character chatbots or...

1,959 views

2023-02-14

Replika and the growth of these character chatbots or socialbots is emerging as a big use case within generative AI. Here is a recent contro...

#chatbot#datascience#gpt3#growth#people#replika#socialbot

An older video, but still very useful. Take time to l...

1,951 views

2024-06-19

An older video, but still very useful. Take time to look at the actual errors of your model. It seems obvious, but too often people just s...

#data#errors#look#model#people#take

You can’t make this stuff up. Can I just say modeling...

1,951 views

2022-04-28

You can’t make this stuff up. Can I just say modeling? #datascience #analytics #statistics #scrum #techtok

#analytics#datascience#make#scrum#statistics#techtok

#onthisday

1,950 views

2025-07-13

#onthisday

#also#contextual#demo#different#evaluating#going#like#model#models#object#onthisday#rag#retrieval#spaces#things#try
Video

Start using Llama 3.2 Vision Models with Hugging Face...

1,949 views

2024-09-29

Start using Llama 3.2 Vision Models with Hugging Face Transformers (on Snowflake)

#das#die#ein#hugging#ich#llama#models#start#und#using#vision#wir

Pricing optimization is a data science use case that ...

1,949 views

2024-07-16

Pricing optimization is a data science use case that is growing. In some areas, like many states in the United States, it is not allowed for...

#data#one#pay#price#pricing#use

Sentence Transformers - https://sbert.net/ MTEB: Mass...

1,947 views

2024-10-15

Sentence Transformers - https://sbert.net/ MTEB: Massive Text Embedding Benchmark - https://huggingface.co/blog/mteb Notebook - https://gith...

#embedding#inference#models#one#pick#text

Fairness in models #datascience #analytics #fairnessm...

1,936 views

2022-03-19

Fairness in models #datascience #analytics #fairnessml #bias #algorithms

#algorithms#analytics#bias#datascience#fairnessml#models

Replying to @darianv19 semantic search versus lexicon...

1,932 views

2022-07-05

Replying to @darianv19 semantic search versus lexicon search. Embeddings help power semantic search. #datascience #embeddings #python

#datascience#doesn#embeddings#python#search#semantic

Reply to @mat.cov05 annotator agreement puts a ceili...

1,917 views

2022-06-19

Reply to @mat.cov05 annotator agreement puts a ceiling on your model performance #datascience #statistics #codetok

#annotators#codetok#customer#datascience#review#statistics

Population Stability Index is a popular way to measur...

1,910 views

2024-05-30

Population Stability Index is a popular way to measure feature drift or data drift when monitoring machine learning models. I am doing a tal...

#cake#mlops#modelmonitoring#population#populationstabilityindex#stability
Video

#onthisday reposting an older video from last year th...

1,892 views

2023-08-22

#onthisday reposting an older video from last year that illustrates kmeans

#last#older#onthisday#reposting#video#year

Who enjoys explaining how ML models work? #machinelea...

1,879 views

2022-07-04

Who enjoys explaining how ML models work? #machinelearning #datascience #statistics #codetok

#codetok#datascience#enjoys#explaining#machinelearning#statistics

Model calibration

1,871 views

2024-11-26

Model calibration

#calibration#disease#let#model#predicting#probability

Transformers aren’t new anymore #datascience #codetok...

1,869 views

2022-06-10

Transformers aren’t new anymore #datascience #codetok #deeplearning #machinelearning #statistics

#codetok#datascience#deeplearning#machinelearning#statistics#transformers

Models that cheat, take shortcuts, and leak informati...

1,865 views

2025-01-02

Models that cheat, take shortcuts, and leak information are all part of the data scientist life style. Every data scientist has a story like...

#data#look#model#models#scientist#using

I need more time to code. #datascience #programming #...

1,862 views

2022-04-15

I need more time to code. #datascience #programming #techtok #python

#datascience#need#programming#python#techtok#time

It has happened. #datascience #codetok #machinelearni...

1,846 views

2022-05-25

It has happened. #datascience #codetok #machinelearning #analytics

#analytics#codetok#datascience#happened#machinelearning

Those GPUs. #datascience #codetok #python #analytics...

1,839 views

2022-05-04

Those GPUs. #datascience #codetok #python #analytics #aws

#analytics#aws#codetok#datascience#gpus#python
Video

Nvidia Prismer model for image captioning and zero sh...

1,835 views

2023-03-15

Nvidia Prismer model for image captioning and zero shot visual question answering. It uses an ensemble or mixture of experts approach. #dat...

#datascience#imagecaptioning#machinelearning#model#nvidia#prismer#uses#visualquestionanswering

Repost, but still useful. Some tips for deploying lar...

1,810 views

2024-07-30

Repost, but still useful. Some tips for deploying large language models like Llama. Start by building some benchmarks for your tasks to asse...

#gpus#like#model#per#quantization#tokens

We like to say “data is data” and that scale fixes ev...

1,810 views

2026-01-07

We like to say “data is data” and that scale fixes everything. This skit questions that assumption using recent work on Data Shapley. By mea...

#data#hurt#model#performance#scale#training

Checking out Flan T5 large language models. Let me kn...

1,804 views

2022-11-09

Checking out Flan T5 large language models. Let me know what wisdom you can find in this model. #machinelearning #datascience #largelanguage...

#datascience#flan#language#largelanguagemodels#machinelearning#model

Measuring progress towards AGI

1,789 views

2024-11-09

Measuring progress towards AGI

#agi#deep#google#level#like#measuring#percentile#progress#skilled#towards

Aged well - With the growth of open-source LLMs, many...

1,784 views

2024-06-10

Aged well - With the growth of open-source LLMs, many leaderboards to rank these models are emerging. Several different methodologies are us...

#different#leaderboards#many#model#models#onthisday

4 Data Science Fails. These are a handful of ways that...

1,784 views

2024-06-06

4 Data Science Fails. These are a handful of ways that society pushes back on data science approaches. It's good to understand why these were...

#algorithm#approaches#data#failed#like#model

I have done a lot of good work in untitled python not...

1,774 views

2022-06-30

I have done a lot of good work in untitled python notebooks. #datascience #machinelearning #python #codetok #thosethatgetitgetit

#codetok#datascience#done#machinelearning#python#thosethatgetitgetit

GPT-3 trivia and French pastries I enjoyed at the 🤗...

1,761 views

2023-02-01

GPT-3 trivia and French pastries I enjoyed at the 🤗 offsite. #datascience #machinelearning #gpt3 #openai #huggingface

#datascience#gpt#gpt3#huggingface#machinelearning#openai
Video

Pandas 2.0 coming with Arrow. A short recap on how i...

1,755 views

2023-03-01

Pandas 2.0 coming with Arrow. A short recap on how it fits in with polars, dplyr, and data.table. #datascience #machinelearning #rstats #py...

#datascience#dplyr#largelanguagemodels#machinelearning#meta#models#opensource#pandas#polars#rstats
Video

OpenAI and Red Wedding

1,753 views

2023-11-18

OpenAI and Red Wedding

#openai#red#wedding
Video

#onthisday Data visualization tips #datascience #data...

1,746 views

2024-03-22

#onthisday Data visualization tips #datascience #dataviz #analytics #datavisualization

#analytics#data#datascience#datavisualization#dataviz#let#onthisday

Image captioning models - GIT from Microsoft and BLIP...

1,743 views

2023-01-05

Image captioning models - GIT from Microsoft and BLIP from salesforce #datascience #machinelearning #imagecaptioning

#datascience#git#image#imagecaptioning#machinelearning#model

No click bait on this account. Feeling sick today (an...

1,733 views

2022-07-06

No click bait on this account. Feeling sick today (and upset between Roe and Highland). Mailing it in today. #datascience #statistics #analy...

#analytics#click#codingbootcamp#datascience#statistics#today

Distance Metrics in Data Science

1,731 views

2024-10-24

Distance Metrics in Data Science

#distance#find#metrics#similar#think#two

Lots of real world problems, it pays to know distribu...

1,725 views

2022-07-08

Lots of real world problems, it pays to know distributions like tweedie. Still sick, so you get old tik tok from my drafts. #datascience #...

#acturialscience#codetok#datascience#get#know#statistics

null

1,717 views

2025-02-14

null

#character#end#mark#need#null#string
Video

YouChat and retrieval augmented models. To play aroun...

1,701 views

2022-12-26

YouChat and retrieval augmented models. To play around with this, check out haystack from deepset. #datascience #machinelearning #youchat #c...

#chatgpt#datascience#information#machinelearning#openai#retrievalaugmentedmodel#youchat

credit to Gavin from work - #codetok #stackoverflow #...

1,697 views

2022-05-05

credit to Gavin from work - #codetok #stackoverflow #programming #python

#codetok#credit#gavin#programming#python#stackoverflow

Improving your data visualizations.

1,692 views

2025-03-21

Improving your data visualizations.

#data#don#improving#let#make#visualizations
Video

Replying to @Data Storyteller Here are two examples o...

1,690 views

2022-07-22

Replying to @Data Storyteller Here are two examples of data or target leakage. I bet people have other fun examples. #datascience #targetlea...

#data#dataleakage#datascience#errors#examples#look#machinelearning#model#people#take#targetleakage
Video

How Error Analysis is Part of Data Science and Machin...

1,686 views

2024-05-20

How Error Analysis is Part of Data Science and Machine Learning

#analysis#boosting#data#error#errors#learning#machine#models#part#science#scientists

Conformal prediction.

1,684 views

2024-10-04

Conformal prediction.

#data#interval#let#prediction#predictions#see

Fundamentals folks. A great example is the paper on p...

1,674 views

2024-05-27

Fundamentals folks. A great example is the paper on police misconduct. It highlights a lot of great data science practices (more than I coul...

#data#learning#machine#misconduct#model#police

Replying to @rajistics here are two themes I wanted t...

1,671 views

2022-12-18

Replying to @rajistics here are two themes I wanted to highlight. The second candidate showed more analytic maturity.

#algorithms#data#important#science#second#two
Video

Code not working? Start with the documented examples ...

1,660 views

2022-09-03

Code not working? Start with the documented examples #datascience #rstats #machinelearning #codetok #python

#better#code#codetok#cohere#datascience#like#llms#machinelearning#model#models#multi#python#rise#rstats#step#tools#use

Gen AI for Recommenders - Roundup of some recent rese...

1,651 views

2025-03-19

Gen AI for Recommenders - Roundup of some recent research from: Eugene Yan: https://eugeneyan.com/writing/recsys-llm/ Jean-Michel Daigna: ht...

#help#improve#information#search#use#used
Video

New state of the art embedding model, Instructor, for...

1,650 views

2023-01-22

New state of the art embedding model, Instructor, for text is available! It accounts for task and domain when creating an embedding. #datascie...

#datascience#embeddings#huggingface#machinelearning#sentencetransformers#word2vec

ABCs of Generative AI: Anything But Chatbots -- There...

1,642 views

2024-07-13

ABCs of Generative AI: Anything But Chatbots -- There is so much value with generative AI, don't get trapped into just building chatbots. No...

#abcs#anything#build#chat#generative#rag
Video

Dtreeviz 2.0 - Visualizing Decision Trees

1,637 views

2022-12-28

Dtreeviz 2.0 - Visualizing Decision Trees

#datascience#decision#decisiontree#decisiontrees#dtreeviz#machinelearning#see#trees#visualizing

Pay attention to software licenses. Nothing like thro...

1,636 views

2025-01-16

Pay attention to software licenses. Nothing like throwing away work because people didn’t pay attention to the licenses. This happens all th...

#attention#away#licenses#like#make#pay#software

It works. #datascience #analytics #codetok #statistic...

1,625 views

2022-05-13

It works. #datascience #analytics #codetok #statistics #dataanalyst

#analytics#codetok#dataanalyst#datascience#genius#statistics

In this video, statistical modeling evolved from manu...

1,621 views

2025-02-19

In this video, statistical modeling evolved from manual processes requiring explicit data preparation (scaling, transformations, imputation)...

#approaches#automated#data#decision#llms#manual#models#modern#power#replace#services#specialized#statisticians#using

Model monitoring and population stability index

1,601 views

2025-06-04

Model monitoring and population stability index

#cake#change#index#population#stability#think

#onthisday

1,595 views

2025-04-21

#onthisday

#know#like#onthisday#said#time#years

Reply to @ereb0s_rl #datascience #analytics #techtok ...

1,594 views

2022-04-24

Reply to @ereb0s_rl #datascience #analytics #techtok #rstats #kaggle #fastair #machinelearning

#analytics#datascience#fastair#kaggle#rstats#techtok

AI that makes you feel better. The paper is Inducing ...

1,593 views

2022-10-14

AI that makes you feel better. The paper is Inducing Positive Perspectives with Text Reframing. You can find a demo over at 🤗 hugging face ...

#better#codetok#datascience#feel#machinelearning#model
Video

Video coming on Text Generation in Colab

1,583 views

2023-07-25

Video coming on Text Generation in Colab

#colab#coming#generational#text#video

Data data data.

1,573 views

2024-08-01

Data data data.

#collected#data#example#might#think#thinking

Recent research on model compression with quantizati...

1,572 views

2024-10-21

Recent research on model compression with quantization from Neural Magic, go check it out: We Ran Over Half a Million Evaluations on Quantiz...

#half#magic#model#models#neural#squeeze

And even better when they submit an issue #datascienc...

1,572 views

2022-08-17

And even better when they submit an issue #datascience #codetok #opensource

#better#burn#codetok#datascience#even#opensource

Credit for the tips to Evan Peck, who pulls from a vi...

1,569 views

2025-03-24

Credit for the tips to Evan Peck, who pulls from a visualization by John Burn-Murdoch in the Financial Times: https://www.ft.com/content/73...

#don#labels#let#like#tips#visualizations

#duet with @Sylar2.5 #parodysong #datascience #codetok

1,561 views

2022-08-30

#duet with @Sylar2.5 #parodysong #datascience #codetok

#codetok#count#datascience#duet#parodysong#sylar2

Staying busy and doing a public talk on Generative AI...

1,541 views

2022-12-27

Staying busy and doing a public talk on Generative AI. It will be about 40 minutes, so it gives me a chance to dive into more details and answer q...

#chatgpt#datascience#generativeai#machinelearning#staying#webinar
Video

Segment Anything (SAM) is a new segmentation model fr...

1,536 views

2023-04-06

Segment Anything (SAM) is a new segmentation model from Meta. It's a huge improvement over the state of the art and is going to change compu...

#computervision#datascience#imagesegmentation#machinelearning#meta#segmentanything

Replying to @joshhenny it’s a great time to learn ab...

1,532 views

2022-11-14

Replying to @joshhenny it’s a great time to learn about #largelanguagemodels or #stablediffusion #datascience #machinelearning

#datascience#going#largelanguagemodels#machinelearning#stablediffusion#time

Regression to the mean with the Madden Curse and Spor...

1,527 views

2022-01-20

Regression to the mean with the Madden Curse and Sports Illustrated Jinx #datascience #analytics #stats #maddencurse #sijinx #regression

#analytics#datascience#maddencurse#regression#sijinx#stats

Datasets have worldviews from Google PAIR, link in co...

1,519 views

2022-02-09

Datasets have worldviews from Google PAIR, link in comments, #datascience #bias #machinelearning #ethics #pair-google #statistics

#bias#datascience#ethics#machinelearning#pair#statistics

After simple baselines, Anomaly detection is hard.

1,517 views

2024-09-26

After simple baselines, Anomaly detection is hard.

#algorithm#algorithms#anomaly#better#data#detection

Investigate before modifying your data

1,496 views

2024-11-25

Investigate before modifying your data

#call#center#data#get#lot#missing

GDG DevFest Ukraine, sign up! #datascience #codetok ...

1,495 views

2022-06-11

GDG DevFest Ukraine, sign up! #datascience #codetok #huggingface #dallemini #bigscience #devfestforukraine #standwithukraine

#bigscience#codetok#dallemini#datascience#devfestforukraine#huggingface
Video

Population Stability Index for Monitoring Machine Lea...

1,481 views

2024-05-30

Population Stability Index for Monitoring Machine Learning Models

#cake#index#learning#machine#mlops#modelmonitoring#monitoring#population#populationstabilityindex#stability

I much prefer working through code examples than deco...

1,476 views

2022-09-20

I much prefer working through code examples than decoding equations. I can’t be the only one. #datascience #statistics

#code#datascience#much#prefer#statistics#working

How else can you work? #datascience #stackoverflow #c...

1,471 views

2022-07-29

How else can you work? #datascience #stackoverflow #codetok

#bottom#codetok#datascience#else#stackoverflow#work

Recipe for Word and Sentence Embeddings: Word2Vec and...

1,470 views

2024-10-11

Recipe for Word and Sentence Embeddings: Word2Vec and now Static Embeddings for word embeddings, and to get more context, use the sentence t...

#embeddings#funny#going#llama#million#one#planning#release#right#sentence#take#transformers#use#well#word#words

Facts. We need data. #datascience #statistics #analys...

1,462 views

2022-05-01

Facts. We need data. #datascience #statistics #analysis #techtok

#analysis#datascience#flow#statistics#techtok#time

Replying to @minisdlatvia my big tip for learning dat...

1,458 views

2022-07-16

Replying to @minisdlatvia my big tip for learning data science #datascience #machinelearning #analytics #codetok #webapps #gradio #streamlit...

#analytics#data#datascience#machinelearning#things#think

Curriculum Learning is about ordering your training d...

1,447 views

2024-07-27

Curriculum Learning is about ordering your training data. It's another useful technique that you should consider. Some background: Overview:...

#curriculum#data#images#learning#train#training

Level set expectations early! People have unrealisti...

1,447 views

2022-06-07

Level set expectations early! People have unrealistic views. #datascience #dataanalytics #statistics #codetok

#codetok#dataanalytics#datascience#ding#level#statistics

#insurance #regulation #datascience #statistics #inte...

1,447 views

2022-05-21

#insurance #regulation #datascience #statistics #interpretablemodels #codetok

#datascience#insurance#interpretablemodels#models#regulation#statistics

TikTok video #7445674328907123999

1,446 views

2024-12-07

TikTok video #7445674328907123999

#7445674328907123999#actually#answers#good#models#text

Reply to @zythesciguy Reply to @zythesciguy #datascie...

1,445 views

2022-05-29

Reply to @zythesciguy Reply to @zythesciguy #datascience #statistics #codetok

#codetok#datascience#feel#reply#statistics#zythesciguy
Video

Feature Selection Methods for Machine Learning, plus ...

1,430 views

2024-10-13

Feature Selection Methods for Machine Learning, plus Feature Selection Curves

#est#feature#fonctionnalit#learning#les#machine#methods#nous#plus#que#selection#vous
Video

Shapley Values in Machine Learning

1,424 views

2022-08-10

Shapley Values in Machine Learning

#learning#machine#shapley#values
Video

Machine Learning Visualizations: My Favorites

1,419 views

2024-05-05

Machine Learning Visualizations: My Favorites

#examples#favorites#fine#learning#machine#models#performance#prompting#tuning#visualizations

My blog post journey from Jekyll to Quarto. #bloggin...

1,419 views

2024-06-04

My blog post journey from Jekyll to Quarto. #blogging #rajistics #jekyll #quarto #posit

#blog#blogging#jekyll#posit#post#quarto

Seen this being hashed out on Twitter and had to join...

1,417 views

2022-08-26

Seen this being hashed out on Twitter and had to join #dataengineering #codetok #duckdb #spark #datascience

#codetok#dataengineering#datascience#duckdb#seen#spark

🚀 GPUs.

1,415 views

2024-09-20

🚀 GPUs.

#going#keep#less#like#models#things

Let’s dig in to the differences between baseline mode...

1,409 views

2025-04-08

Let’s dig in to the differences between baseline models and benchmark datasets.

#baseline#benchmark#data#model#models#simple#use

Mmm. Food classification.

1,409 views

2024-08-04

Mmm. Food classification.

#build#data#images#model#much#start

It’s always longer than you want to get your data prep...

1,402 views

2022-08-07

It’s always longer than you want to get your data prepped. #datascience #dataengineering #analytics #codetok

#always#analytics#codetok#dataengineering#datascience#longer

Keep your data science projects page loaded by making...

1,400 views

2022-06-03

Keep your data science projects page loaded by making open source versions of your work. #datascience #codetok #programming #python

#codetok#datascience#going#keep#programming#python
Video

Glitch Tokens in Large Language Models (SolidGoldMagi...

1,397 views

2024-05-12

Glitch Tokens in Large Language Models (SolidGoldMagikarp)

#bfloat16#datascience#floating#glitch#language#large#machinelearning#models#quantization#solidgoldmagikarp#tokens

Evaluation data is so so important when working on ma...

1,395 views

2024-08-09

Evaluation data is so so important when working on machine learning or generative AI projects. Labeling data is an important task and you can...

#data#evaluation#example#examples#important#labeling
Video

Transformer Explainer (1 min Short) - Interactive Vis...

1,393 views

2024-08-11

Transformer Explainer (1 min Short) - Interactive Visualization for Transformers

#end#explainer#interactive#min#short#tool#transformer#visualization

What a data scientist does #datascience #analytics #c...

1,393 views

2022-09-04

What a data scientist does #datascience #analytics #codetok #python

#analytics#codetok#data#datascience#python#scientist

Scaling LLMs. Two years ago OpenAI had a big lead. Si...

1,378 views

2025-01-07

Scaling LLMs. Two years ago OpenAI had a big lead. Since then, other companies have learned to effectively scale their model training.

#billion#going#laws#model#models#scaling

Clustering with kmeans

1,363 views

2024-08-22

Clustering with kmeans

#algorithm#clustering#data#people#start#works

Every once in a while, I go back and try to build som...

1,359 views

2025-11-09

Every once in a while, I go back and try to build some AI from the ground up. Lately, it's been "Mixture of Experts" (MoE) models, and I foun...

#bin#data#experts#going#histograms#kind#like#plots#see#size#well

Generative AI is awesome. If you need more ideas chec...

1,356 views

2024-07-25

Generative AI is awesome. If you need more ideas check out all 35 real world examples for using Generative AI / LLMs that Evidently has put ...

#generative#models#together#use#uses#using

Great data scientists figure out the best questions c...

1,356 views

2022-11-05

Great data scientists figure out the best questions come from talking to people. #datascience book is practical python and opencv by rosebr...

#best#data#datascience#figure#great#scientists

Agents in AI Some effective examples for using agents...

1,349 views

2025-01-04

Agents in AI Some effective examples for using agents based on posts from Anthropic and Hugging Face. Hugging Face - SmolAgents: https://hug...

#agents#anthropic#effective#face#hugging#smolagents

Machine learning tradeoffs between explainability and ...

1,345 views

2024-08-06

Machine learning tradeoffs between explainability and accuracy.

#bathrooms#learning#model#price#sales#understand
Video

Baseline Models and Benchmark Datasets Explained

1,331 views

2023-04-09

Baseline Models and Benchmark Datasets Explained

#baseline#baselinemodel#benchmark#datascience#datasets#explained#machinelearning#model#models
Video

Evaluation for Generative AI - A simply explained sta...

1,308 views

2025-05-17

Evaluation for Generative AI - A simply explained starting point

#evaluation#explained#gen#generative#going#kind#like#lot#point#simply#starting

Older video and Cleanlab has changed its license. But...

1,303 views

2025-01-30

Older video and Cleanlab has changed its license. But it’s still a cool tool to be aware of.

#changed#cleanlab#data#errors#license#look#model#older#see#set#video#website
Video

Llama 3 effects on the AI / LLM Startups Like OpenAI

1,293 views

2024-04-21

Llama 3 effects on the AI / LLM Startups Like OpenAI

#effects#going#like#llama#llama3#llm#meta#open#openai#startups

Your data science 101 reminder when working with clas...

1,289 views

2022-10-10

Your data science 101 reminder when working with classification models. #datascience #statistics #codetok

#codetok#data#datascience#science#statistics#teddy

What tools are way too hard to use? #datascience #st...

1,285 views

2022-03-05

What tools are way too hard to use? #datascience #statistics #analytics

#analytics#datascience#hard#statistics#tools#way

This fits so well for agents. Timeless.

1,282 views

2025-02-05

This fits so well for agents. Timeless.

#chat#get#got#gpt#hey#know

Filling in those job duties 🚩🚩 #datascience #codetok

1,275 views

2022-08-23

Filling in those job duties 🚩🚩 #datascience #codetok

#codetok#datascience#duties#filling#job#music

#onthisday

1,273 views

2024-12-05

#onthisday

#data#going#let#onthisday#time#want
Video

Creative Coding with Nature of Code from Coding Train...

1,271 views

2024-09-08

Creative Coding with Nature of Code from Coding Train's Daniel Shiffman

#code#coding#creative#creator#daniel#great#hero#learn#nature#new#released#train

ChatGPT has sucked up a lot of my attention. Will do ...

1,265 views

2022-12-07

ChatGPT has sucked up a lot of my attention. Will do a post soon on how it works.

#attention#chatgpt#lot#post#soon#sucked

I was rocking with SPSS back in 2009. I didn’t start...

1,263 views

2022-03-20

I was rocking with SPSS back in 2009. I didn’t start using R until a few years later. We had to pay $$$ for a basic regression. #datascien...

#back#datascience#didn#rocking#spss#statistics

Technical AI systems are vulnerable to out-of-distrib...

1,260 views

2025-02-20

Technical AI systems are vulnerable to out-of-distribution behaviors, as shown through evasion techniques like unusual movement patterns and...

#beat#computer#like#patterns#strategies#system#systems#technical#training

A deeper dive into Latent Space. Get a sense of how w...

1,257 views

2024-09-10

A deeper dive into Latent Space. Get a sense of how we compress these models and then uncompress them to unlock creativity.

#deeper#dive#get#information#latent#let#numbers#see#sense#space

Reminder to be smart about how you use your training ...

1,254 views

2022-12-13

Reminder to be smart about how you use your training data. #machinelearning #datacentricai #datascience #waymo #reinforcementlearning

#data#datacentricai#machinelearning#model#runs#waymo

Love stats 🤣

1,229 views

2025-12-21

Love stats 🤣

#favorite#happen#love#right#stats#thing#think
Video

Should you take the time to learn Kubernetes as a dat...

1,215 views

2023-01-23

Should you take the time to learn Kubernetes as a data scientist? Or are you already overloaded learning data science? #datascience #machinelear...

#data#datascience#don#kubernetes#learn#machinelearning#take#time

Test yourself against AI for both next word predictio...

1,212 views

2024-11-16

Test yourself against AI for both next word prediction and overall knowledge: Are you smarter than a language model? - https://joel.tools/sm...

#get#gpt4#language#large#like#model#next#smarter#word

Offering ways to improve your machine learning models...

1,211 views

2022-07-23

Offering ways to improve your machine learning models #huggingface #datascience #codetok #datacentricai #adversarial

#adversarial#better#codetok#datascience#huggingface#model
Video

Tracking Covid - #datascience #analytics #wastewater ...

1,206 views

2022-01-27

Tracking Covid - #datascience #analytics #wastewater #covid19 #data #monitoring

#analytics#covid19#data#datascience#monitoring#wastewater

Have some projects in your github #datascience #githu...

1,204 views

2022-08-03

Have some projects in your github #datascience #github #codetok

#codetok#datascience#github#hum#projects

What presentation style works for you?

1,202 views

2024-12-23

What presentation style works for you?

#analysis#campaigns#going#marketing#shows#talk

Let’s be practical about using AI. Here we recognize ...

1,186 views

2025-05-07

Let’s be practical about using AI. Here we recognize that hallucinations are a legitimate concern, but let’s rank that against other concerns...

#concerns#hallucinations#like#model#models#percent#using

Probe the data #dataanalysis #datascience #statistics...

1,164 views

2022-08-01

Probe the data #dataanalysis #datascience #statistics #bias

#bias#collected#data#dataanalysis#datascience#statistics

Who would have known that my pronunciation of LaTeX w...

1,137 views

2022-10-15

Who would have known that my pronunciation of LaTeX would be such a big deal and so divisive. It’s all good. Go listen for yourself. @rajist...

#don#know#known#latex#pronunciation#would
Video

Having some fun connecting a spreadsheet to a ML mode...

1,136 views

2022-11-04

Having some fun connecting a spreadsheet to a ML model. It wasn’t too hard and it’s pretty cool to have it working this way. #datascienc...

#connecting#datascience#fun#huggingface#machinelearning#spreadsheet

Creating beautiful plots of data maps. DataMapPlot is...

1,130 views

2025-01-09

Creating beautiful plots of data maps. DataMapPlot is a small library designed to help you make beautiful data map plots for inclusion in pr...

#able#beautiful#data#get#like#plots#use

Replying to @chairstaple so many good distance metric...

1,130 views

2022-10-02

Replying to @chairstaple so many good distance metrics - what’s yours? This video covers Hamming, Levenshtein, Euclidean, Manhattan, and Ma...

#cleaning#data#distance#measure#often#ways

Evaluation of Large Language Models is a critical top...

1,124 views

2024-08-29

Evaluation of Large Language Models is a critical topic. Leaderboards provide a little guidance for evaluation but have many flaws. I posted t...

#evaluatingllms#largelanguagemodels#leaderboards#models#paper#people

Catch me on the Practically Intelligent podcast episo...

1,115 views

2024-05-29

Catch me on the Practically Intelligent podcast episode 13. I talk about open source, enterprise AI, and the craziness of the LLM space.

#large#like#open#source#talk#world

Getting the best distance metric is crucial for solvi...

1,108 views

2024-10-06

Getting the best distance metric is crucial for solving analytical problems. This video reviews Euclidean, Manhattan, Mahalanobis, Levenshte...

#data#datascience#distance#distancemetrics#machinelearning#measure
Video

🚀 Just get started on your journey to learn large ...

1,097 views

2023-07-26

🚀 Just get started on your journey to learn large language models! 🤔 Is there a lot to learn? Yes! 😅 🤷‍♂️ But is it easy t...

#ai#datascience#dollar#get#largelanguagemodels#llama2#machinelearning#million#netflix#pepsiapplepiechallenge#started

Some hints on how to evaluate Github projects.

1,081 views

2024-08-28

Some hints on how to evaluate Github projects.

#github#issues#like#look#project#projects

New style of content, let me know if you want more li...

1,074 views

2022-11-13

New style of content, let me know if you want more like this. Predict sentiment #machinelearning #datascience #transformers #huggingface

#datascience#huggingface#machinelearning#new#style#transformers

Long video in comments, #huggingface #datascience #re...

1,065 views

2022-07-15

Long video in comments, #huggingface #datascience #reinforcementlearning #deeplearning #codetok #mltok Earlier weeks: @Rajiv Shah @Rajiv Sha...

#codetok#datascience#deeplearning#huggingface#mltok#reinforcementlearning
Video

Multi Agent Systems Introduction

1,063 views

2025-06-18

Multi Agent Systems Introduction

#agent#anthropic#going#introduction#multi#research#systems#tasks

My favorite was a training on how to use zoom #securi...

1,037 views

2022-05-24

My favorite was a training on how to use zoom #securitytraining #codetok

#codetok#favorite#securitytraining#training#use#zoom
Video

Quick introduction to optimization and for advanced f...

1,031 views

2022-12-24

Quick introduction to optimization and for advanced folks, go run a notebook from gurobi or do the Kaggle Santa challenge. #datascience #mac...

#better#chatgpt#datascience#google#gurobi#machinelearning#model#models#optimization#quick#travelingsalesmanproblem

AI models brag about their “perfect” specs — until th...

1,030 views

2025-10-25

AI models brag about their “perfect” specs — until the test starts. Same rules. Same data. Totally different characters. Inspired by Anthrop...

#harm#model#models#rules#say#specs
Video

7 Baseline Predictive Models for Anomaly, Search, Tim...

1,029 views

2024-06-22

7 Baseline Predictive Models for Anomaly, Search, Time Series and More

#annoy#anomaly#baseline#data#embeddings#models#numbers#predictive#scientists#search#spotify#time

Latest from Google on advanced reasoning. Gemini 2.0 ...

1,027 views

2024-12-19

Latest from Google on advanced reasoning. Gemini 2.0 Flash Thinking

#active#advanced#age#flash#gemini#google#latest#new#radioactive#reasoning#waking#welcome
Video

AI only knows what it's trained on. So beat it by d...

1,025 views

2023-02-21

AI only knows what it's trained on. So beat it by doing something new. The video shows recent examples of marines beating a surveillance s...

#beat#datadrift#datascience#knows#machinelearning#modelmonitoring#system#trained

Netflix $1 million dollar prize #datascience #ai #net...

1,024 views

2022-01-09

Netflix $1 million dollar prize #datascience #ai #netflix #PepsiApplePieChallenge

#ai#datascience#dollars#million#netflix#pepsiapplepiechallenge

Reasoning and planning are key weaknesses of LLMs. Th...

1,023 views

2024-08-13

Reasoning and planning are key weaknesses of LLMs. This video was from last year, but the issue still remains. My guess is we will see addit...

#gpt#planning#really#reasoning#see#test

Wow. Look at that subscription revenue.

1,017 views

2024-07-12

Wow. Look at that subscription revenue.

#check#come#look#revenue#subscription#wow

Comparing algorithms spiral dataset, #datascience #ma...

1,014 views

2022-01-24

Comparing algorithms spiral dataset, #datascience #machinelearning #algorithms #gbm #logisticregression

#algorithms#comparing#datascience#gbm#logisticregression#machinelearning

Beer and diapers story of association of products. #d...

1,012 views

2022-02-18

Beer and diapers story of association of products. #datascience #recommendationsystems #marketing #analytics #correlation

#analytics#beer#correlation#datascience#marketing#recommendationsystems

Still making PowerPoint-2008-level charts? AI gives y...

1,010 views

2025-12-01

Still making PowerPoint-2008-level charts? AI gives you a shortcut to stunning visuals and real design skills. Pick the level that fits wher...

#canva#design#gives#google#level#making#really#slides#still#take#want#whether
Video

CLIP Interrogator is available over at the hugging fa...

1,010 views

2022-10-25

CLIP Interrogator is available over at the hugging face spaces. Have fun! #datascience #machinelearning #stablediffusion #huggingface

#clip#datascience#huggingface#interrogator#machinelearning#revolution#stablediffusion#statistics#tabpfn#time

Looking forward to a lot more videos in 2023, let me ...

1,005 views

2023-01-01

Looking forward to a lot more videos in 2023, let me know topics I should cover. For all my videos, I put them in an airtable spreadsheet av...

#alright#baby#feeling#know#looking#videos

Predicting NCAA basketball #marchmadness #datascience...

1,004 views

2022-03-25

Predicting NCAA basketball #marchmadness #datascience #sportsanalytics #illinois

#bracket#datascience#illinois#marchmadness#sportsanalytics#team

#onthisday

993 views

2024-07-24

#onthisday

#onthisday#praying#regina#spent#talking#time

It’s tough to be content #codetok #techtok #datascien...

987 views

2022-05-10

It’s tough to be content #codetok #techtok #datascience #programming

#codetok#content#datascience#programming#techtok#tough
Video

Sparsity in AI (data frames, dropout, regularization,...

985 views

2025-03-09

Sparsity in AI (data frames, dropout, regularization, and Mixture of Experts)

#data#datascience#dropout#frames#important#machinelearning#make#mixture#regularization#sparsity#statistics#sure

What’s the deal with those competition rules #datasci...

968 views

2022-08-28

What’s the deal with those competition rules #datascience #codetok #analytics #kaggle

#analytics#codetok#competition#datascience#deal#kaggle

Creating music videos with stable diffusion and whisp...

966 views

2022-09-28

Creating music videos with stable diffusion and whisper. This colab notebook uses a dream studio backend for the images. Another great step ...

#analytics#datascience#diffusion#music#stable#stablediffusion
Video

ChatGPT for Robotics is the latest hot paper. Large l...

961 views

2023-02-22

ChatGPT for Robotics is the latest hot paper. Large language models are the future interface. #datascience #machinelearning #largelanguagemo...

#chatgpt#datascience#largelanguagemodels#machinelearning#microsoft#robotics

News flash: Data scientists spend lots of time on dat...

947 views

2022-07-24

News flash: Data scientists spend lots of time on data prep/exploration #datascience #dataengineering #analytics #codetok

#analytics#codetok#data#dataengineering#datascience#time
Video

Vicuna is awesome, go check it out. It’s the latest LLa...

928 views

2023-03-30

Vicuna is awesome, go check it out. It’s the latest LLaMA model and very impressive. I ended up cutting out the details on Vicuna since I feel...

#datascience#gpt3#largelanguagemodels#llama#machinelearning#openai#temperature#vicuna#want

Reply to @anthonycomputer Dive in and start! Lots of...

899 views

2022-04-20

Reply to @anthonycomputer Dive in and start! Lots of great stuff out there. #datascience #techtok #analytics

#analytics#datascience#one#right#techtok#things

Great way to get under the skin of your data scientis...

897 views

2022-10-08

Great way to get under the skin of your data scientist. #datascience #analytics #codetok

#analytics#codetok#datascience#get#great#way

I am awful about writing tests. This is why I don’t w...

896 views

2022-04-01

I am awful about writing tests. This is why I don’t write production code. #datascience #cstok #programminghumor #codetok

#body#codetok#cstok#datascience#mind#programminghumor

Simple tip, never claim causation. Unless you have an...

886 views

2022-11-05

Simple tip, never claim causation. Unless you have an experimental design, it’s hard to prove. #datascience #machinelearning #statistics

#datascience#god#machinelearning#simple#statistics#tip
Video

Temperature is an important parameter when working wi...

881 views

2023-03-21

Temperature is an important parameter when working with many models including GPT-3. This video gives a short background on temperature and ...

#datascience#gpt3#gpt4#largelanguagemodels#lora#machinelearning#model#peft#temperature
Video

X-decoder from Microsoft. Check out the instructional...

879 views

2023-02-16

X-decoder from Microsoft. Check out the instructional text demo. I added in video released by the team at the bottom. If too many people don...

#datascience#decoder#machinelearning#model#pix2pix#text#video#x

From Spiegelhalter interview on Artists of Data Scien...

867 views

2022-05-23

From Spiegelhalter interview on Artists of Data Science podcast #datascience #statistics #codetok #dataanalysis

#codetok#dataanalysis#datascience#interview#spiegelhalter#statistics

Practical Lessons for building generative AI: I share...

863 views

2024-10-02

Practical Lessons for building generative AI: I share the latest research and earned wisdom on building generative AI applications and touch...

#data#going#know#like#model#models

I have lived this. #conwayslaw #softwaredevelopment #...

846 views

2022-05-17

I have lived this. #conwayslaw #softwaredevelopment #codetok #programming

#codetok#conwayslaw#day#every#programming#softwaredevelopment

Rerunning your old code #datascience #techtok #progra...

843 views

2022-04-12

Rerunning your old code #datascience #techtok #programming #analytics

#analytics#datascience#old#programming#rerunning#techtok

Data science work is hard to schedule and plan. It co...

843 views

2022-01-26

Data science work is hard to schedule and plan. It conflicts with agile methods. #datascience #machinelearning #dataanalytics #agile #scrumm...

#agile#dataanalytics#datascience#long#machinelearning#scrummaster

Use these tips!

827 views

2024-08-02

Use these tips!

#data#got#know#new#really#use

I have no desire to build data infrastructure. I will...

802 views

2022-12-02

I have no desire to build data infrastructure. I will leave that to my #dataengineer friends. #datascience

#build#dataengineer#datascience#desire#get#gotta
Video

ClimaX, a new transformer-based model for predicting ...

799 views

2023-02-07

ClimaX, a new transformer-based model for predicting weather and climate. Great example of the flexibility of transformers based...

#based#climatemodel#climax#datascience#machinelearning#models#transformers
Video

Corporate research labs have changed academic work wi...

798 views

2023-01-28

Corporate research labs have changed academic work with their reluctance to provide reproducible research and getting around blind peer revi...

#corporate#datascience#hey#machinelearning#neurips#reproducibility#research#review

#aifilter #aifilterchallenge had to try it out and go...

795 views

2022-12-17

#aifilter #aifilterchallenge had to try it out and got a bit more buff

#aifilter#aifilterchallenge#bit#buff#got#try
Video

FrugalGPT: Tips for saving money, processing time, an...

792 views

2024-06-08

FrugalGPT: Tips for saving money, processing time, and improving speed with LLMs

#frugalgpt#great#llm#model#money#processing#saving#time#tips#use

Dreaded git push error. Had a little help tonight. #...

780 views

2022-07-23

Dreaded git push error. Had a little help tonight. #git #datascience #python

#datascience#dreaded#error#git#push#python
Video

Roundup of all the big headlines, hope this is fun fo...

777 views

2023-03-03

Roundup of all the big headlines, hope this is fun for you all. I laugh while making these, but wonder how many of you get all the referenc...

#apple#datascience#google#machinelearning#meta#openai#stabilityai

Context engineering works, until it doesn’t. Recursiv...

755 views

2026-01-04

Context engineering works, until it doesn’t. Recursive Language Models ask a sharper question: why are humans managing memory, search, and p...

#context#engineering#language#long#model#models
Video

ChatDoctor is a great example of fine tuning a large ...

749 views

2023-03-22

ChatDoctor is a great example of fine tuning a large language model to get more factually correct output. This is an approach I expect many ...

#chatdoctor#chatgpt#datascience#finetuning#largelanguagemodels#machinelearning#model
Video

Better LLMs (or not) including BloombergGPT, Databric...

742 views

2024-04-12

Better LLMs (or not) including BloombergGPT, Databricks, NVIDIA, Amazon, and Falcon

#better#bloomberggpt#chatgpt#databricks#including#like#llms#model#nvidia#openai#rlhf#using
Video

It’s exasperating. #techtok #datascience #programming

731 views

2022-04-22

It’s exasperating. #techtok #datascience #programming

#datascience#exasperating#programming#techtok

New to Unix or Bash? This is a fast, visual walkthrou...

721 views

2026-01-04

New to Unix or Bash? This is a fast, visual walkthrough of the core terminal commands every beginner should know: where you are, how to move...

#file#new#running#see#unix#want

Being above average part II. Cite in comments. @raji...

710 views

2022-02-13

Being above average part II. Cite in comments. @rajistics #statistics #regressiontothemean #aboveaverage

#aboveaverage#average#less#likely#regressiontothemean#statistics
Video

Practical Lessons in Building Generative AI: RAG and ...

705 views

2024-09-27

Practical Lessons in Building Generative AI: RAG and Text to SQL

#building#frosty#generative#gpt#lessons#model#practical#rag#text
Video

Start Collecting Evaluation Data for Machine Learning...

694 views

2024-08-09

Start Collecting Evaluation Data for Machine Learning Projects

#app#collecting#data#evaluation#important#learning#machine#start#working
Video

Tuskegee Airman by geo karamanis links to code in com...

674 views

2022-02-14

Tuskegee Airman by geo karamanis links to code in comments #TidyTuesday #rstats #datascience #datavisualization

#airman#datascience#datavisualization#rstats#tidytuesday#tuskegee

self driving cars and data quality - LOA - #datascien...

674 views

2022-01-25

self driving cars and data quality - LOA - #datascience #machinelearning #selfdrivingcar #stanford #data #stats

#data#datascience#machinelearning#selfdrivingcar#stanford#stats
Video

Optimization using the Python Optimal Transport Package

670 views

2023-04-24

Optimization using the Python Optimal Transport Package

#computervision#datascience#diffusion#machinelearning#objectdetection#optimal#optimization#package#python#transport#using
Video

Optimal punt return #datascience #nfl

668 views

2022-01-10

Optimal punt return #datascience #nfl

#datascience#nfl#optimal#punt#return

Those pesky outliers.

666 views

2022-10-20

Those pesky outliers.

#dealt#find#outliers#pesky#soon#trust
Video

My second try to explain in-context learning or few-s...

660 views

2023-01-27

My second try to explain in-context learning or few-shot learning with large language models. It’s very cool and why these models are so e...

#datascience#fewshotlearning#gpt3#incontextlearning#largelanguagemodels#machinelearning#models

One of my favorites for #explainability #datascience ...

655 views

2022-06-28

One of my favorites for #explainability #datascience #statistics #interpretability #codetok #python #machinelearning

#codetok#datascience#explainability#interpretability#python#statistics

Some things are bigger than data science, I have a pe...

637 views

2022-02-25

Some things are bigger than data science, I have a personal connection here and have to express my support. #ukraine #priceoffreedom #datasc...

#bigger#data#datascience#priceoffreedom#things#ukraine
Video

Interpretable Machine Learning Models Simply Explaine...

632 views

2024-08-10

Interpretable Machine Learning Models Simply Explained - Rulefit, GA2M, Rule Lists, and Scorecard

#ability#accuracy#explain#explained#interpretable#learning#machine#models#simply#tradeoff
Video

How computers think: Classification and Regression. #...

632 views

2022-01-11

How computers think: Classification and Regression. #datascience #machinelearning #fruitbowl

#classification#computers#datascience#fruitbowl#machinelearning#think

Reply to @noleli median versus mean

622 views

2022-02-11

Reply to @noleli median versus mean

#average#half#income#mean#median#people

Analysis never ends.

620 views

2022-10-23

Analysis never ends.

#analysis#ends#never#problem

The damage I have done with root access. What have yo...

611 views

2022-09-09

The damage I have done with root access. What have you done? #codetok #python

#codetok#damage#done#lose#python#root
Video

Pair programming is some of my favorite times as a da...

609 views

2023-03-19

Pair programming is some of my favorite times as a data scientist. I am starting to use ChatGPT to fill that role lately. It’s useful for me....

#chatgpt#codex#datascience#machinelearning#pair#pairprogramming#way
Video

ChatGPT for Robotics is the latest hot paper. Large l...

587 views

2023-02-22

ChatGPT for Robotics is the latest hot paper. Large language models are the future interface. #datascience #machinelearning #largelanguagemo...

#chatgpt#customer#customerlifetimevalue#datascience#largelanguagemodels#machinelearning#marketinganalytics#microsoft#rfm#robotics
Video

Funny stuff, not created by me - #datascience #codeto...

584 views

2022-08-03

Funny stuff, not created by me - #datascience #codetok #deeplearning

#codetok#created#data#datascience#deeplearning#finetuning#funny#language#largelanguagemodels#model#norwegian#rutergpt#stuff#vasamuseet
Video

Scandals in AI: Objaverse, Llama, Alpaca, and Dolly

581 views

2023-03-27

Scandals in AI: Objaverse, Llama, Alpaca, and Dolly

#alpaca#data#dolly#largelanguagemodels#llama#meta#objaverse#scandals
Video

Point-E from #openai. Generating 3D point clouds from...

581 views

2022-12-20

Point-E from #openai. Generating 3D point clouds from text #datascience #machinelearning

#chatgpt#clouds#datascience#generating#machinelearning#openai#point#retrievalaugmentedmodel#youchat
Video

Our fellow algorithms calling mom featuring our linea...

581 views

2022-10-21

Our fellow algorithms calling mom featuring our linear model, XGBoost, and Neural Networks. I had fun making them.

#algorithms#calling#craiyon#dallemini#datascience#featuring#fellow#linear#machinelearning#mom#stablediffusion#texttoimage

False positive and false negative #datascience #stati...

575 views

2022-02-06

False positive and false negative #datascience #statistics #decionmaking #classificationalgorithm #algorithm

#aim#algorithm#classificationalgorithm#datascience#decionmaking#statistics
Video

Scaling laws help us figure out how to manage the amount...

563 views

2023-01-07

Scaling laws help us figure out how to manage the amount of training data versus the model size. DeepMind showed with Chinchilla by using more ...

#blip#datascience#deepmind#git#image#largelanguagemodels#machinelearning#microsoft#model#nvidia#openai#see
Video

Deciding whether to use a Large Language Model or a s...

549 views

2023-06-02

Deciding whether to use a Large Language Model or a smaller model? This video explores the tradeoffs between both approaches based on the la...

#datascience#deciding#largelanguagemodels#machinelearning#model#smaller
Video

We all play roulette with stackoverflow. #programming...

538 views

2022-03-24

We all play roulette with stackoverflow. #programming #datascience #python

#datascience#play#programming#python#roulette#stackoverflow
Video

It’s almost here. Full support for pandas in sklear...

534 views

2022-10-18

It’s almost here. Full support for pandas in sklearn pipelines. #machinelearning #datascience #codetok #python #sklearn #sci-kit

#baseline#codetok#datascience#machinelearning#model#python#sci#sklearn#statistics#timeseries

Don’t feel bad if you haven’t put a machine learning m...

533 views

2022-08-11

Don’t feel bad if you haven’t put a machine learning model into production. Lots of valuable data scientists haven’t done that.

#bad#don#feel#havent#machine#put
Video

Selecting and Speeding up your Sentence Transformer M...

528 views

2024-10-15

Selecting and Speeding up your Sentence Transformer Models

#embeddings#models#selecting#sentence#speeding#static#transformer#use#word#word2vec

Machine learning engineer growing career #machinelear...

520 views

2022-01-29

Machine learning engineer growing career #machinelearning #datascience #dataengineering #programming #ai #career #programmingbootcamp #stati...

#career#dataengineering#datascience#machinelearning#programming#programmingbootcamp
Video

Speculating on GPT-4 size and performance. #datascien...

516 views

2023-02-20

Speculating on GPT-4 size and performance. #datascience #machinelearning #gpt3 #gpt4

#datascience#gpt#gpt3#gpt4#machinelearning#size#speculating
Video

GPT4 hype that it will be 100 trillion parameters. Th...

514 views

2023-01-16

GPT4 hype that it will be 100 trillion parameters. This doesn’t make any sense. See the video on scaling laws @rajistics and think about t...

#datascience#gpt4#hype#machinelearning#openai#trillion
Video

Anthropic is starting to preview their model and peop...

508 views

2023-01-08

Anthropic is starting to preview their model and people are comparing it to ChatGPT. Thanks to Riley Goodside for sharing screenshots. It lo...

#anthropic#big#bigdatabowl#chatgpt#claude#data#datascience#largelanguagemodels#machinelearning#nfl#statistics

At least it will be faster to build the second time. ...

501 views

2022-09-29

At least it will be faster to build the second time. Ugh. How often have you had to recode something?

#build#faster#least#second#time#ugh
Video

So many hyperparameters - this is from pytorch foreca...

495 views

2022-01-30

So many hyperparameters - this is from pytorch forecasting #datascience #machinelearning #hyperparameters #coding #algorithms #modeling

#algorithms#coding#datascience#hyperparameters#machinelearning#modeling
Video

Getting explainability when working with transformer ...

490 views

2022-10-19

Getting explainability when working with transformer based image or vision models. Uses Captum on the backend, but makes it easy to get imag...

#captum#computervision#datascience#explainability#huggingface#machinelearning

Talk business not data science metrics to have a busi...

490 views

2022-02-16

Talk business not data science metrics to have a business impact #datascience #machinelearning #statistics #analytics

#analytics#business#datascience#machinelearning#running#statistics
Video

Hallucination-Free? Assessing the Reliability of Leadi...

477 views

2024-06-01

Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools

#assessing#companies#forrester#free#hallucination#legal#like#make#marketing#ofleading#reliability#research
Video

Reinforcement learning with my Eat Melon! Demo based ...

476 views

2022-04-05

Reinforcement learning with my Eat Melon! Demo based on Karpathy #datascience #reinforcementlearning #techtok #machinelearning

#datascience#learning#machinelearning#reinforcement#reinforcementlearning#techtok
Video

AI Literacy, Question 1, can AI think by itself? #ai ...

471 views

2022-01-16

AI Literacy, Question 1, can AI think by itself? #ai #datascience #programming #counsciousness #alleninstitute #literacy #capcut

#alleninstitute#capcut#counsciousness#datascience#literacy#programming
Video

Speed up XGBoost using Hist split method (faster than...

465 views

2024-10-27

Speed up XGBoost using Hist split method (faster than Exact and Approx)

#actually#disease#hist#let#look#method#model#regression#speed#split#using#xgboost
Video

Replying to @rajistics as promised, the features or va...

435 views

2023-02-12

Replying to @rajistics as promised, the features or variables in auto insurance models. Keep the feedback coming. #datascience #machinelearni...

#acturialscience#autoinsurance#datascience#datavisualization#great#histogram#histograms#insurance#machinelearning#replying#statistics
Video

Using Logit Bias in Large Language Models

428 views

2023-11-11

Using Logit Bias in Large Language Models

#bias#datascience#language#large#like#logit#machinelearning#microsoft#models#rust#using

How I added my TikTok, Instagram, and YouTube videos ...

426 views

2024-04-02

How I added my TikTok, Instagram, and YouTube videos to my website. I used Buzzlytics to gather the information, Python to munge all the da...

#buzzlytics#chatgpt#rajivshah#vercel#videos#website
Video

Text Similarity Techniques: Lexical, Semantic, and Ha...

397 views

2024-09-29

Text Similarity Techniques: Lexical, Semantic, and Hashing

#des#hashing#les#lexical#mais#pour#qui#semantic#similarity#techniques#text#vous

TikTok video #7141035352076094762

388 views

2022-09-08

TikTok video #7141035352076094762

#7141035352076094762#ask#chance#don#video
Video

Choosing Transformer, Word2Vec, or a Sentence Transfo...

382 views

2024-06-13

Choosing Transformer, Word2Vec, or a Sentence Transformer

#choosing#even#get#like#sentence#transformer#transformers#word2vec
Video

How companies use your data for training models will ...

356 views

2023-01-19

How companies use your data for training models will be a big issue this year. GitHub is being sued for Copilot and Hugging Face has been bu...

#bigcode#code#companies#copilot#data#datascience#github#huggingface
Video

Transformer Explainer (Full Video) - Interactive Visu...

353 views

2024-08-11

Transformer Explainer (Full Video) - Interactive Visualization for Transformers

#audio#explainer#face#full#hugging#interactive#speech#speecht5#text#transformer#video#visualization
Video

Don't let people overlook open source software. It mi...

349 views

2024-01-19

Don't let people overlook open source software. It might be free but it's priceless. The Value of Open Source Software at https://papers.ssr...

#code#don#open#opensource#papers#software#source#value
Video

ColPali: Bringing Vision Language Models to Document ...

346 views

2024-10-10

ColPali: Bringing Vision Language Models to Document Retrieval

#also#bringing#colpali#die#document#language#models#und#vision#wir
Video

ChatGPT price drop. Let’s break down how much the p...

342 views

2023-03-02

ChatGPT price drop. Let’s break down how much the price dropped, how OpenAI could drop the price, the effects on performance, what is goin...

#anthropic#chatgpt#cohere#datascience#langchain#machinelearning#openai
Video

State of Generative AI in 2024 and How it is Falling ...

337 views

2024-07-01

State of Generative AI in 2024 and How it is Falling Short

#bigcode#evaluation#falling#functionalcorrectness#generative#huggingface#short#state#unit#unittests
Video

Retrieval Secrets from the #1 Solution in the Kaggle ...

333 views

2025-01-03

Retrieval Secrets from the #1 Solution in the Kaggle Eedi Math Competition

#1#data#eedi#going#kaggle#retrieval#retrievers#secrets#solution#synthetic#use#used

I hope this pain isn’t shared widely #techtok #powerp...

332 views

2022-04-12

I hope this pain isn’t shared widely #techtok #powerpoint #datascientist

#datascientist#hope#isn#pain#powerpoint#techtok
Video

Big data bowl submissions are going in and lots of gr...

330 views

2023-01-13

Big data bowl submissions are going in and lots of great sports analytic work. This one is on strain for evaluating pass rushers. #datascien...

#big#bigdatabowl#data#datascience#defensive#ends#football#nfl#pass#physics#statistics#understand
Video

Explanations in Machine Learning

314 views

2022-08-10

Explanations in Machine Learning

#explanations#learning#machine
Video

I still haven’t tried Copilot. Have you? #datascience ...

309 views

2022-07-27

I still haven’t tried Copilot. Have you? #datascience #codetok #codex #copilot #python

#codetok#codex#copilot#datascience#flant5#gpt3#largelanguagemodels#machinelearning#python#reasoningwithpeople#still

Logical song full explanation here: @rajistics #sij...

308 views

2022-01-21

Logical song full explanation here: @rajistics #sijinx #maddencurse #stats #analytics #regression

#analytics#logical#maddencurse#regression#sijinx#stats
Video

Recognizing Meta AI's contribution

307 views

2023-05-14

Recognizing Meta AI's contribution

#contribution#meta#recognizing
Video

MedEmbed: Fine-Tuned Embedding Models for Medical / C...

298 views

2024-10-25

MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical

#das#die#eine#embedding#fine#ist#medembed#medical#models#tuned#und#wir
Video

The pain. Data munging on poorly prepped data. #datas...

297 views

2022-03-22

The pain. Data munging on poorly prepped data. #datascience #analytics #csv

#analytics#csv#data#datascience#munging#pain
Video

Feature Selection with Boruta, MRMR, and Recursive Fe...

292 views

2024-09-22

Feature Selection with Boruta, MRMR, and Recursive Feature Elimination

#boruta#elimination#explained#feature#many#methods#mrmr#recursive#selection
Video

Overdue for sports analytics #datascience #analytics ...

286 views

2022-08-12

Overdue for sports analytics #datascience #analytics #codetok #sportsanalytics #machinelearning

#analytics#codetok#datascience#machinelearning#overdue#sportsanalytics

Accuracy is not your friend for most problems #datasc...

280 views

2022-01-13

Accuracy is not your friend for most problems #datascience #media #norythm

#accuracy#datascience#friend#media#norythm#problems
Video

2022 plans #2022 #datascience #ai #machinelearning #s...

279 views

2022-01-19

2022 plans #2022 #datascience #ai #machinelearning #skillup start with AI Literacy @rajistics

#2022#ai#datascience#machinelearning#plans#skillup

i love notebooks #notebooks #programming #rstats

274 views

2022-01-12

i love notebooks #notebooks #programming #rstats

#cannot#cigarette#love#notebooks#programming#rstats
Video

How an Image Classifier Learns (using time lapse)

263 views

2017-04-18

How an Image Classifier Learns (using time lapse)

#cheating#classifier#datascience#image#kaggle#lapse#learns#machinelearning#ottocompetition#reared#time#using
Video

It’s important to make sure your model is well cali...

262 views

2022-11-11

It’s important to make sure your model is well calibrated. This becomes especially important with imbalanced data. #machinelearning #datas...

#data#datascience#important#machinelearning#make#much#need#statistics#sure
Video

Open Source with Stable Diffusion - #datascience #cod...

256 views

2022-08-27

Open Source with Stable Diffusion - #datascience #codetok #machinelearning #stablediffusion #opensourcesoftware

#codetok#datascience#machinelearning#open#opensourcesoftware#stablediffusion
Video

Using agents in langchain with gpt-3. You can do this...

245 views

2023-03-04

Using agents in langchain with gpt-3. You can do this! Go check it out. #datascience #machinelearning #openai #gpt3 #langchain

#answer#datascience#gpt3#langchain#machinelearning#openai
Video

Contrastive learning is common for folks working in N...

243 views

2022-11-03

Contrastive learning is common for folks working in NLP and images. This was new to me, so wanted to share the intuition a bit more widely. ...

#common#contrastive#folks#learning#nlp#working
Video

Implicit as an Implicit Recommender for Collaborative...

242 views

2024-05-25

Implicit as an Implicit Recommender for Collaborative Filtering

#collaborative#explicit#filtering#implicit#need#recommender
Video

Prompting versus Fine Tuning for Large Language Models

235 views

2024-05-07

Prompting versus Fine Tuning for Large Language Models

#fine#glitchtokens#language#large#largelanguagemodels#like#prompting#time#token#tokens#tuning#versus
Video

Meta’s less than open source model and some bad tak...

224 views

2023-03-05

Meta’s less than open source model and some bad takes from Twitter. #datascience #machinelearning #largelanguagemodels #opensource #meta

#datascience#largelanguagemodels#learning#less#machinelearning#meta#opensource#python#tools#use
Video

GPT-4o mini from OpenAI: Performance, Cost, Competit...

223 views

2024-07-19

GPT-4o mini from OpenAI: Performance, Cost, Competition, and Development of LLMs

#competition#cost#data#going#gpt#mini#model#models#open#performance#smarter#training
Video

Anthropic's research on Mapping the Mind of the Langu...

222 views

2024-05-24

Anthropic's research on Mapping the Mind of the Language Model

#anthropic#language#largelanguagemodels#like#mapping#mechanisticinterpretability#mind#model#research
Video

Reminder to visualize your data with one of my favori...

222 views

2022-10-29

Reminder to visualize your data with one of my favorites #anscombesquartet #datavisualization #datascience #statistics

#anscombesquartet#datascience#datavisualization#fun#huggingface#machinelearning#reminder#stablediffusion#statistics#told#visualize
Video

AI that makes you feel better. The paper is Inducing ...

208 views

2022-10-14

AI that makes you feel better. The paper is Inducing Positive Perspectives with Text Reframing. You can find a demo over at 🤗 Hugging Fac...

#better#codetok#datascience#feel#machinelearning#makes#model#positive#reframing
Video

Your weekly dose of LLM news. I liked this because it...

207 views

2022-11-06

Your weekly dose of LLM news. I liked this because it had interesting results with a smart approach. #datascience #machinelearning #largelan...

#datascience#dose#largelanguagemodels#let#llm#machinelearning#prediction#predictioninterval#predictions#statistics#weekly
Video

Interpretable models are often overlooked, but a grea...

203 views

2022-11-05

Interpretable models are often overlooked, but a great addition to your data science toolkit. Imodels is a great python package for getting ...

#datascience#great#imodels#interpretablemodels#machinelearning#statistics

What did I do this time? I hope your IT experiences ...

199 views

2022-07-13

What did I do this time? I hope your IT experiences go much better.

#better#experienced#hey#hope#much#time
Video

https://www.tiktok.com/@rajistics/video/7204141835965...

191 views

2023-02-25

https://www.tiktok.com/@rajistics/video/7204141835965500715

#video
Video

Interpretable models are often overlooked, but a grea...

182 views

2022-11-05

Interpretable models are often overlooked, but a great addition to your data science toolkit. Imodels is a great python package for getting ...

#datascience#flan#great#imodels#interpretablemodels#language#largelanguagemodels#machinelearning#model#statistics
Video

Random forests and their ease of use are important in...

181 views

2023-02-18

Random forests and their ease of use are important in understanding modern data science. #datascience #machinelearning #statistics #randomfo...

#data#datascience#fortran#machinelearning#randomforest#statistics
Video

Using AI for Pose Detection, this is such a cool appl...

171 views

2022-09-05

Using AI for Pose Detection, this is such a cool application. #datascience #deeplearning #codetok #posedetection #sportsanalytics

#codetok#datascience#deeplearning#posedetection#sportsanalytics#using
Video

Dreams of a better GPU #gpu #nvidia #deeplearning #ga...

170 views

2022-02-12

Dreams of a better GPU #gpu #nvidia #deeplearning #gaming #datascience

#datascience#deeplearning#dreams#gaming#gpu#nvidia

Pie chart fails #stats #datascience #datavisualizatio...

169 views

2022-01-15

Pie chart fails #stats #datascience #datavisualization #piechart #analytics #fails

#analytics#datascience#datavisualization#fails#piechart#stats
Video

Curriculum Learning in Machine Learning - Ordering Tr...

168 views

2024-07-27

Curriculum Learning in Machine Learning - Ordering Training Data Improves Performance

#curriculum#data#die#learning#machine#ordering#sie#snowflake#training#wie
Video

Rajistics on Gartner and Forrester research reports i...

159 views

2024-06-08

Rajistics on Gartner and Forrester research reports in AI

#approach#dspy#forrester#gartner#like#looks#prompting#reports#research#using
Video

How to Read Github (and find the Best Projects)

148 views

2024-08-29

How to Read Github (and find the Best Projects)

#best#evaluate#find#github#hints#projects#read
Video

Explaining how Emily Ocasio won second place with her...

145 views

2023-03-29

Explaining how Emily Ocasio won second place with her project analyzing media coverage. I like her approach, which highlights a growing trend o...

#data#datascience#emily#emilyocasio#explaining#machinelearning#project#promptengineering#prompting#science#second#societyforscience
Video

Keras versus Pytorch Benchmarking Controversy

144 views

2024-04-06

Keras versus Pytorch Benchmarking Controversy

#benchmarking#benchmarks#compare#controversy#going#keras#python#pytorch#versus
Video

Limits of AI: Compute, Memory, and Interconnection

143 views

2024-09-06

Limits of AI: Compute, Memory, and Interconnection

#arxiv#compute#dram#interconnection#limits#memory#org#pdf#wall
Video

Embeddings, Context, and the Static Embeddings in Sen...

139 views

2024-10-11

Embeddings, Context, and the Static Embeddings in Sentence Transformers

#context#den#die#embeddings#ich#nnen#sentence#sie#static#transformers#wie
Video

Hyperparameter Optimization

137 views

2024-06-20

Hyperparameter Optimization

#algorithms#hyperparameter#hyperparameters#learning#like#optimization#search
Video

Pricing Optimization with Machine Learning - A funny ...

134 views

2024-07-16

Pricing Optimization with Machine Learning - A funny summary

#data#funny#learning#machine#one#optimization#pay#price#pricing#summary#use

Crowdsource labor for #ai #machinelearning - longer v...

129 views

2022-01-22

Crowdsource labor for #ai #machinelearning - longer video explaining this coming out later today.

#ai#crowdsource#labor#listen#longer#machinelearning
Video

MobileLLM from Meta is full of efficient architecture...

128 views

2024-07-12

MobileLLM from Meta is full of efficient architecture ideas for LLMs

#architecture#blocks#efficient#full#ideas#meta#mobilellm#model#swiglu#using
Video

Let's talk about why enterprises are considering alter...

126 views

2023-03-18

Let's talk about why enterprises are considering alternatives to ChatGPT by looking to open source. An open source strategy can affect lots o...

#api#chatgpt#data#don#enterprises#know#lets#open#opensource#source#talk
Video

Starting to see people productionizing GPT-3 workflow...

120 views

2023-03-11

Starting to see people productionizing GPT-3 workflows. I am a big fan of using large language models. Here is how one data science dealt wi...

#datascience#gpt3#large#largelanguagemodels#machinelearning#natdev#see#starting
Video

Retrieval Augmented Generation - What it is and how i...

119 views

2024-05-10

Retrieval Augmented Generation - What it is and how it works

#algorithms#augmented#generation#machine#machinelearning#retrieval#tools#understand#visualizations#works
Video

Dealing with overplotting, another visualization tip...

113 views

2023-01-08

Dealing with overplotting, another visualization tip from data to viz #datascience #machinelearning #statistics #datavisualization

#data#datascience#datavisualization#dealing#groups#like#lot#machinelearning#plotting#statistics#use
Video

OpenAI plugins! Let's get everyone's APIs working with ...

109 views

2023-03-23

OpenAI plugins! Let's get everyone's APIs working with LLMs! This is a good thing. #largelanguagemodels #langchain #openai #datascience #machin...

#chatgpt#datascience#huggingface#langchain#largelanguagemodels#machinelearning#models#openai#stablediffusion#text2video
Video

AI Literacy, Q2, Driverless cars #ai #datascience #dr...

109 views

2022-01-18

AI Literacy, Q2, Driverless cars #ai #datascience #driverless #cars #tesla #fsd #alleninstitute

#alleninstitute#cars#datascience#driverless#fsd#tesla
Video

Probe the data #dataanalysis #datascience #statistics...

104 views

2022-08-01

Probe the data #dataanalysis #datascience #statistics #bias

#bias#code#data#dataanalysis#datascience#huggingface#machinelearning#model#probe#spreadsheet#statistics
Video

Try out these examples for yourself and lots more are...

101 views

2023-01-31

Try out these examples for yourself and lots more are available. It’s scary cool how these models are working. #datascience #machinelearni...

#datascience#flant5#gpt3#largelanguagemodels#machinelearning#models#problem#reasoningwithpeople#solve
Video

Learning curves, it’s a technique I use all the tim...

95 views

2022-11-16

Learning curves, it’s a technique I use all the time when training models. Thanks to Todd C for showing me the best way to explain this. #...

#bigdata#datascience#galactica#learning#like#machinelearning#meta#model#say#statistics#stuff#technique
Video

Human Expertise in Text to SQL (Databricks, Snowflake...

91 views

2024-09-01

Human Expertise in Text to SQL (Databricks, Snowflake, and Numbers Station)

#automated#databricks#engineering#expertise#feature#github#human#notebook#openfe#snowflake#sql#text
Video

My data science setup for now #datascience #codetok #...

87 views

2022-08-20

My data science setup for now #datascience #codetok #python #rstats #posit #vscode #googlecolab #digitalocean #conda

#codetok#datascience#posit#python#rstats#vscode
Video

Data drift analysis is a must for production workload...

79 views

2023-03-13

Data drift analysis is a must for production workloads. Here is Uber’s D3 system for automated drift analysis. This video covers types of ...

#automated#data#datadrift#datascience#drift#issues#machinelearning#mlops#prophet
Video

Just how smart is ChatGPT and other #largelanguagemod...

79 views

2023-01-04

Just how smart is ChatGPT and other #largelanguagemodels? Big Bench is a set of benchmark tests to assess the performance of the models. And ...

#chatgpt#datascience#largelanguagemodels#machinelearning#models#smart
Video

Challenging Benchmarks for LLMs: MUSR and Connections

75 views

2024-07-04

Challenging Benchmarks for LLMs: MUSR and Connections

#benchmarks#challenging#connections#llms#musr#reasoning#still
Video

ABCs of AI for Machine Learning and Generative AI - A...

73 views

2024-07-13

ABCs of AI for Machine Learning and Generative AI - Anything But Chatbots

#abcs#anything#build#chatbots#generative#learning#machine#rag
Video

No big deal, use visualization #stats #datascience #d...

73 views

2022-01-14

No big deal, use visualization #stats #datascience #datasaurus #datascience #analytics #anscombe #visualization

#analytics#anscombe#datasaurus#datascience#stats#visualization
Video

Twitter open sourced its recommendation algorithm. I...

64 views

2023-03-31

Twitter open sourced its recommendation algorithm. It's fun to look at someone else's production code and will be useful to people studying...

#datascience#machinelearning#open#recommenders#sourced#twitter
Video

Data Scientist versus Data Analyst - Police Misconduct

63 views

2024-05-27

Data Scientist versus Data Analyst - Police Misconduct

#analyst#data#learning#machine#misconduct#model#police#scientist#versus
Video

Replying to @anansaadi OpenAssistant is an open sourc...

63 views

2023-02-19

Replying to @anansaadi OpenAssistant is an open source project that aims to provide a chat based assistant that connects to other sources of...

#chatgpt#datascience#feedback#help#information#machinelearning#open#openai#openassistant
Video

Applying a classic methodology of ablation when worki...

63 views

2022-11-12

Applying a classic methodology of ablation when working with stable diffusion prompts. Ablation is very common in many techniques to underst...

#ablation#ablationsurgery#analytics#datascience#functions#loss#machinelearning#regression#rsme#stablediffusion#statistics
Video

How I added a list of my TikTok videos to my website

61 views

2024-04-02

How I added a list of my TikTok videos to my website

#added#buzzlytics#chatgpt#data#list#rajivshah#tik#tok#videos#web#website
Video

Cleanlab is open source and will improve your data qu...

61 views

2023-01-29

Cleanlab is open source and will improve your data quality. It’s so underrated. This was hard to record vertically, so go try it out. #dat...

#cleanlab#confidentlearning#dataquality#datascience#datasets#explainability#labelerror#machinelearning#synthetic#syntheticdata
Video

Explaining AI

60 views

2023-06-26

Explaining AI

#alpaca#datascience#explaining#largelanguagemodels#llama#machinelearning#objaverse
Video

Models that cheat, take shortcuts, and leak informati...

58 views

2023-01-03

Models that cheat, take shortcuts, and leak information are all part of the data scientist lifestyle. Every data scientist has a story li...

#data#datascience#machinelearning#model#models#scientist
Video

tiktok e2a990da04ca9451707be3db1b1bbef66c5254a5

2024-09-30

#block#check#earlier#latest#videos#world
Video

tiktok d7bef697298bb1d985ca3c6f7cbd11ed7ae7d689

2024-09-01

#aiethics#cigna#human#loop#news#tesla
Video

tiktok 417158c1e75afdec1eb5bc35f46b7b138c6056ca

2024-08-31

#performance#snowflake#source#sql#text#work
Video

tiktok a6aa66a4f29979c305632aa3bf16159062777ba7

2024-08-30

#chatgpt#datascience#machinelearning#reinforcementlearning#rlhf#techtok
Video

tiktok 0e623c2cf183c1083703c98f65cd692b76e6aa5b

2024-08-29

#evaluatingllms#evaluation#language#large#largelanguagemodels#models
Video

tiktok 4d213362842492fbc692ca591489cdcec809f3f7

2024-08-28

#clustering#kmeans
Video

tiktok b9a370cab2d3b0936212e9c2ce29d961e2ab3cd4

2024-08-22

#courses#data#don#getting#overspend#science
Video

tiktok 9df3656300e252c0d3de70d269aad4bc567806c0

2024-08-20

#key#llms#models#planning#reasoning#weaknesses
Video

tiktok 5355e480148d4573f3462165cc2a3f67c5059ee6

2024-08-11

#hungry#makes#mmm
Video

tiktok 50b157f75104d353c7f4be7fc18ad0a497056b89

2024-08-09

#das#daten#die#ich#ist#sich
Video

tiktok b659e06c1dd534fb395538bc618be1f8b7543e06

2024-08-06

#datos#est#los#para#pensando#que
Video

tiktok afe4c9419d63f900e158d47033b367e3cccd8ab9

2024-08-04

#das#gpus#ich#ist#pro#sie
Video

5 tiers of data centers

2024-07-18

5 tiers of data centers

#centers#data#higher#pay#tier#tiers
Video

tiktok 0743cb2a2ae297e3c8ccc35ba27da2ec9ba16ecf

2024-07-19

#openai#revenue#wow
Video

tiktok 0b122ac2e9694c92b91164414905566fdd2fc14f

2024-07-12

#able#behavior#chicago#crime#numbers#statistics
Video

tiktok 4cf7ae463f2bad712f25c9e362e25157c65521a6

2024-07-05

#also#cold#concepts#look#see#tensorboard
Video

tiktok 8c066ba61667767f6b38fe18b5dd8e1e281f6683

2024-07-01

#ai#aiexplained#customers#datascience#machinelearning#years
Video

7 Baseline Models: Time Series: Previous Value Anomal...

2024-06-22

7 Baseline Models: Time Series: Previous Value Anomaly: p99 Search: BM25 Recommendation: Popularity Buy recommendations: last viewed Classif...

#baseline#models#recommendation#search#use#value
Video

tiktok bb7abbecd2c5cd54f01726919f43c5d803879533

2024-06-06

#data#decomposition#let#series#start#time
Video

tiktok 19e9c4dc3117a7c860ebb3df3e3f7ce69ef12488

2024-06-04

#different#leaderboards#many#model#models#use
Video

tiktok 954676c979b9f2656d789b3e0f97f299509c178d

2024-06-01

#algorithm#approaches#data#failed#like#model
Video

tiktok 775eb1bfdcf68644342be85b1c8c554005253e0c

2024-05-30

#blog#blogging#jekyll#posit#post#quarto
Video

tiktok 7fd83346daf58d13afdb3984dcb33cc9195efd7a

2024-05-29

#doesn#hallucinations#legal#llms#paragraph#rag
Video

tiktok 00c676f59f8cf572ebe8336fe2df7c4b0bd5ca5e

2024-05-27

#large#like#live#open#source#world
Video

tiktok cf32a89cf851869db4c43616089a42d2cf26b89f

2024-05-21

#content#copyright#datascience#datasets#laoin#llama
Video

tiktok d00bf504b17c83aa06976273e93d65b7b35d59e6

2025-01-13

#ast#code#leak#leaks#resource#right
Video

The politics of ChatGPT, it’s no different than any...

2022-12-27

The politics of ChatGPT, it’s no different than any other technology and is not neutral. If you want a simple explanation of how ChatGPT w...

#ago#chatgpt#datascience#machinelearning#openai#politics#technologyethics#themes#timeless#two#years
Video

tiktok 990843282321f08d5aaf5a5932e6d003f6474e9d

2025-01-03

#data#let#like#lot#use#viz
Video

tiktok de97fd3138a8f8d4d080bb166276cd00c49af5ca

2025-01-02

#baseline#models#recommendation#search#use#value
Video

tiktok 64b0a4d7321788a7ba7541b749e80ecaffb9cd6f

2024-12-23

#data#got#learning#really#science#somebody
Video

tiktok 68e99405e8971cfeeff7d1136218c61bfcd17415

2024-12-20

#ready
Video

My take on Objaverse Llama and Alpaca. Not a lot of r...

2023-03-25

My take on Objaverse Llama and Alpaca. Not a lot of respect for copyright or contract terms. #largelanguagemodels #datascience #machinelearn...

#alpaca#data#datascience#largelanguagemodels#lead#llama#machinelearning#models#objaverse#open#use#vicuna
Video

Ensembling is a key method in machine learning. This vi...

2023-03-14

Ensembling is a key method in machine learning. This video introduces ensembling through majority voting. #datascience #machinelearning #ensem...

#datascience#ensembling#kaggle#key#langchain#langflow#largelanguagemodels#machinelearning#majorityvoting#use
Video

Best machine learning tools for competitions. Lots of...

2023-03-09

Best machine learning tools for competitions. Lots of great stuff here. #datascience #machinelearning #python #codetok

#accuracy#best#codetok#datascience#ensembling#machine#machinelearning#majority#models#python#three#voting
Video

tiktok 320b9177d62d2a93d69107a70c76c21db5a3060e

2025-01-14

Video

tiktok a69842409e16f28d371db823dc9ef60f1f15a440

2025-01-11

Video

tiktok 2b77820d46cd2ce156e7a60d75267aaf69e765fb

2025-01-10

Video

tiktok 5fbfda71a12d1952651ade3a93099c09706745ce

2025-01-09

Video

tiktok a3ef42d8bf23f62babe6a6b4675133b84e859427

2025-01-08

Video

tiktok d4771ab2c38fc049f245c2909abb761afe720662

2025-01-07

Video

tiktok 8f831528b60167821cb18be81442f52f30a375c0

2025-01-05

Video

tiktok d5d63e99298efa6c0423da9917fb4c0a2e3eab67

2025-01-04

Video

tiktok 0d4cfd4336634201f295e09f99e92ef5fa6a54c7

2024-12-19

#agents#astra#project#sculpture#using#world
Video

tiktok 903c15d39aee91b5a50221f5e0de892c298d05f0

2024-12-19

Video

tiktok 2c87e26ed66b006024fdc949b270eb5e9435b153

2024-12-17

Video

tiktok 1c7fa2839202a58d61ba4c7afa4a28eae107cef1

2025-03-21

#accurate#hours#model#next#spend#spent
Video

tiktok 124c9d35666bbd220694533bad88908f4cb1bae7

2024-12-12

#data#going#let#team#time#want
Video

tiktok ba59e0260fcb7eee8f65083fe8390511e75acf3d

2024-10-30

#data#distance#levinci#metrics#science
Video

tiktok ab70212cbc5914f13b1643b96d76fc60de8dea1d

2024-10-29

#data#der#die#man#revolution#science
Video

tiktok 08ffeb08df0e47f4d3fbbc5c7120071a912b9276

2024-10-28

#evaluations#half#magic#million#neural#ran
Video

tiktok 96cb6701682ae307ce7eae6b28570743ec7bfafb

2024-10-25

#das#ist#modell#nicht#sie#und
Video

tiktok 425c38061777a01ab55105ee38c0b382b0884be6

2024-10-15

#baselines#remember#rsc
Video

tiktok caa90da79fbc05977cd1c19f1a42e9568bfa8bc6

2024-10-13

#ber#datascience#die#distanz#ist#man
Video

tiktok 3f0c6902fa5a79371eb5d6cda9f068ad9f77c706

2024-10-11

#des#diction#dictions#donn#les#pour
Video

tiktok 48d3b4d7d189f76a66980535eae81feeb890348e

2024-10-06

#alternatives#copilot#github#gpt#tgi
Video

tiktok b030263c4a5b1a509660564c61b6c54d2b265b4d

2024-10-02

#algoritmo#anomal#datos#detecci#mejor#que
Video

tiktok 2b26a9a7fe39c1f929b2bce1f79b9b6987aa601d

2024-09-30

#die#ein#eine#ist#und#wissen
Video

tiktok ab5517dd732ef557e417f11c56b665c433d22982

2024-09-26

#das#die#haben#sie#und#wie
Video

tiktok 1b5cf92f9eea6609cd7e9a85d4815881356dace8

2024-09-25

#cat#des#donn#est#une#vous
Video

tiktok d3e1d6b2b51a4f6d63a13025820aa59e02e742b1

2024-09-20

#abs#arxiv#detection#object#org#yolo
Video

tiktok 709161c3d2236bdfbdafab1b6ad710a6d29ef9ad

2024-05-23

#data#four#good#gpt#labeling#well
Video

tiktok ffcd51d466fbaf06c6b067719c672165f926cd17

2024-05-25

#bojan#hopefully#linkedin#post#saw#tunguz
Video

tiktok 844e96adce97f89ef0cc605924da96b094d0fe0f

2024-05-17

#documents#get#largelanguagemodels#rag#retrievalaugmentedgeneration#snowflake
Video

tiktok b9c0b9b5b5823a6d96689096093f16871b5f156e

2024-05-04

#credit#entropy#know#liability#look#rating
Video

tiktok 325733951ba1b2b080519863609b58df0876fcc6

2024-05-02

#data#largelanguagemodels#llama#really#training#trainingdata
Video

tiktok a5562e05a4234cd0e1c6c939c12f26c0cc677602

2024-04-27

#chatgpt#datascience#machinelearning#reinforcementlearning#rlhf#techtok
Video

tiktok 54aba190d1b3c5efca9aef8da61c0123890bc641

2024-04-23

#databases#datascience#embeddings#faiss#pinecone#vector
Video

tiktok 35db4b850d175713598b9f83c91925299673d3a8

2024-04-18

#crowdai#data#leakage#machinelearning#sarcos#target
Video

tiktok b8f5c81c0a4f7eb4981a87d3af5d05a2b3ef5f5b

2024-04-14

#baseline#baselinemodel#benchmark#datascience#machinelearning#model
Video

tiktok 39fe20eb758016a4a43ecd1d3b552263c0a86de4

2024-04-12

#cohere#model#multi#tool#tools#use
Video

tiktok 2de5791aaafd1430249a1f1b6a97b40454657c60

2024-04-10

#datascience#machinelearning#matrixalgebra#one#singularvaluedecomposition#svd
Video

Data Centric AI helps to remind us not to focus too m...

2023-02-23

Data Centric AI helps to remind us not to focus too much on the model or algorithms. In real data science, it’s more about understanding y...

#cleanlab#data#datacentricai#datascience#erroranalysis#machinelearning#model
Video

Wrap up of current events going on with chat includin...

2023-02-16

Wrap up of current events going on with chat including #openai #chatgpt #bing #amazon #datascience #machinelearning

#amazon#bing#chatgpt#datascience#like#machinelearning#models#openai
Video

tiktok c50c73941e834e5984e3e0c1e90117d5cbebf631

2023-02-19

#data#let#lot#mediocre#science#videos
Video

Roundup of this week's news, let me know if you all li...

2023-02-10

Roundup of this week's news, let me know if you all like this format. I had a lot of fun making this. #datascience #machinelearning #dumbtech...

#datascience#dumbtechnews#google#machinelearning#microsoft#openai
Video

OpenAI AI classifier is a great example to remind peo...

2023-02-04

OpenAI AI classifier is a great example to remind people of the limitations when detecting rare events. It’s not intuitive, so I showed th...

#baseratefallacy#cheating#classifier#competition#datascience#detecting#kaggle#machinelearning#openai#ottocompetition#statistics
Video

Picking the right GPU for deep learning based on Tim ...

2023-01-17

Picking the right GPU for deep learning based on Tim Dettmers blog post.

#gpu#makes#right#sense#tim#using
Video

GPT3.5 takes the bar exam with very little tuning. It...

2022-12-30

GPT3.5 takes the bar exam with very little tuning. It does pretty well. #gpt #datascience #machinelearning #barexam #law

#barexam#chatgpt#datascience#gpt#gpt3#large#law#like#machinelearning#openai#reasoning
Video

Clustering with k-means. This skit was inspired by th...

2022-12-31

Clustering with k-means. This skit was inspired by the examples in Schubert's paper on why to stop using the elbow criterion for k-means. Any other cl...

#chatgpt#chatgptp#clustering#datascience#kmeans#like#machinelearning#machinelearnjng#means#openai#statistics
Video

Models that cheat, take shortcuts, and leak informati...

2023-01-03

Models that cheat, take shortcuts, and leak information are all part of the data scientist lifestyle. Every data scientist has a story li...

#bar#cheat#data#datascience#exam#gpt#machinelearning#models#openai#pretty#scientist
Video

Meta’s Cicero AI that plays Diplomacy and knows how...

2022-11-24

Meta’s Cicero AI that plays Diplomacy and knows how to get its way with people. #datascience

#datascience#diplomacy#figured#get#meta#trained
Video

DiffusionDet is bringing generative approaches to obj...

2022-11-21

DiffusionDet is bringing generative approaches to object detection #computervision

#computervision#detection#generative#object#paper#shows
Video

Regularization is something I need more in my everyda...

2022-11-19

Regularization is something I need more in my everyday life.

#hammering#maybe#regularization#shirt#shoes#videos
Video

Automatic speech recognition using transformers. It i...

2022-11-17

Automatic speech recognition using transformers. It is that easy!

#automatic#recognition#speech#start#transformers#using
Video

Editing facts in large language models. An exciting a...

2022-11-06

Editing facts in large language models. An exciting approach that is probing LLMs. #largelanguagemodels #datascience #machinelearning

#datascience#editing#facts#large#largelanguagemodels#machinelearning
Video

Software exec at the end is the best. Your quick intr...

2022-11-01

Software exec at the end is the best. Your quick intro to patents, trademarks, and licenses. I see too many comments where people get confus...

#licenses#patents#people#software#trademarks#use
Video

Stable diffusion for markup. This is about better und...

2022-10-30

Stable diffusion for markup. This is about better understanding how to go from text to image, not a practical solution. #stablediffusion #da...

#datascience#diffusion#markup#model#stablediffusion#statistics
Video

Mixing in some law with data science. #craiyon #dalle...

2022-10-30

Mixing in some law with data science. #craiyon #dallemini #stablediffusion #texttoimage #machinelearning #datascience

#craiyon#dallemini#dolly#open#stablediffusion#trademark
Video

Reminder to visualize your data #datascience #datavis...

2022-10-30

Reminder to visualize your data #datascience #datavisualization #statistics

#analyze#data#datascience#datavisualization#statistics#visualize
Video

tiktok c117fdd786b5fa4b9ca9bf6e63bb5784bbaa2f2b

2023-04-05

#datascience#emilyocasio#explaining#machinelearning#promptengineering#societyforscience
Video

tiktok 09d58e11fb8f71d12e407b242351d657cb0331c5

2023-04-02

#datascience#huggingface#machinelearning#models#stablediffusion#text2video
Video

tiktok 1bfd35d388a29397ed0c8b2413093d3e6846b824

2023-04-01

#datascience#flant5#largelanguagemodels#lora#machinelearning#peft
Video

tiktok 91cfc737c7bfbab33af3e1938be71a135a91b710

2023-03-27

#datascience#gpt3#gpt4#largelanguagemodels#machinelearning#temperature
Video

tiktok 0386ff01a07604d0d2939e3c1a395093b9918e48

2023-03-26

#chatgpt#codex#datascience#machinelearning#pair#pairprogramming
Video

tiktok 23ef998b251599c7ad0769b331d3c7388ef46b0b

2023-03-25

#data#enterprises#lets#open#source#talk
Video

tiktok 92b2c950113e3e344c676be2387696f7e2760e53

2023-03-22

#datascience#imagecaptioning#machinelearning#nvidia#prismer#visualquestionanswering
Video

tiktok 7600368163371f79348dbde7355ad7b2624f8a25

2023-03-22

#datascience#gpt4#langchain#langflow#largelanguagemodels#machinelearning
Video

tiktok b01c415cbe197395d672a1697f15f8bcce0f6c62

2023-03-19

#datadrift#datascience#drift#machinelearning#mlops#prophet
Video

tiktok 57f46e0b2a673ca85ce1c18b516dc3fb951315d4

2023-03-18

#datascience#gpt3#largelanguagemodels#machinelearning#see#starting
Video

tiktok 9afa0af2e2d89cec29cc5ce519ad2fce9f569801

2023-03-17

#datascience#gpt3#largelanguagemodels#machinelearning#nat#natdev
Video

tiktok df04d2a352228dd8fe797b2f60216402bea19596

2023-03-15

#best#codetok#datascience#machine#machinelearning#python
Video

tiktok b6bc845ad22ac829802bc2b587b1104739cb7b37

2023-03-15

#datascience#ensembling#kaggle#key#machinelearning#majorityvoting
Video

tiktok d4f56cd69671cddaf9659ed34261197c90be41f9

2023-03-13

#datascience#fonts#generativeai#machinelearning#stablediffusion#word
Video

tiktok 0e24ee241dd470ca7526ee5f0b53e99e277820d0

2023-03-11

#datascience#largelanguagemodels#less#machinelearning#meta#opensource
Video

tiktok 5cc3823f5e80faa4ba21bb0be4549d9f381c8381

2023-03-10

#datascience#google#machinelearning#meta#openai#stabilityai
Video

tiktok ce63b22e2a11d8eac544f2b46a8723093f0f35bd

2023-03-02

#cleanlab#data#datacentricai#datascience#erroranalysis#machinelearning
Video

tiktok 7c4249bfb070958df2e8dfd21c1651cf7da25266

2023-02-27

#datascience#gpt#gpt3#gpt4#machinelearning#speculating
Video

Is explainability important for you? #datascience #ex...

2022-08-06

Is explainability important for you? #datascience #explainability #interpretability #statistics #codetalk #machinelearning

#codetalk#datascience#explainability#interpretability#machinelearning#statistics
Video

tiktok b16b91c28d975c34028f8e7f66812744ed670595

2022-07-27

#adversarial#codetok#datacentricai#datascience#huggingface#offering
Video

Learn about foundational models, especially in #nlp #...

2022-04-23

Learn about foundational models, especially in #nlp #naturallanguageprocessing #datascience #deeplearning #analytics #techtok #openai

#analytics#datascience#deeplearning#naturallanguageprocessing#nlp#techtok
Video

Loss Functions - simple example of MAE versus RMSE #d...

2022-08-30

Loss Functions - simple example of MAE versus RMSE #datascience #statistics #analytics #codetok #regression

#analytics#codetok#datascience#loss#regression#statistics
Video

Rust for machine learning. It’s useful in some case...

2022-09-25

Rust for machine learning. It’s useful in some cases for ML, but learn python first. #datascience #codetok #python #machinelearning #rust

#codetok#datascience#machine#machinelearning#python#rust
Video

Diffusion models for markup. #datascience #machinelea...

2022-10-13

Diffusion models for markup. #datascience #machinelearning #stablediffusion

#datascience#diffusion#machinelearning#markup#models#stablediffusion
Video

AI that makes you feel better. The paper is Inducing ...

2022-10-14

AI that makes you feel better. The paper is Inducing Positive Perspectives with Text Reframing. You can find a demo over at 🤗 Hugging Fac...

#codetok#datascience#machinelearning#makes#positive#reframing
Video

TabPFN revolution in data science. Please don’t you...

2022-10-22

TabPFN revolution in data science. Please don’t waste your time on all this hype. Every week there is a revolution announced on Twitter. Ignore ...

#datascience#machinelearning#revolution#statistics#tabpfn#time
Video

Reminder to visualize your data with one of my favori...

2022-10-29

Reminder to visualize your data with one of my favorites #anscombesquartet #datavisualization #datascience #statistics

#anscombesquartet#datascience#datavisualization#reminder#statistics#visualize
Video

Software exec at the end is the best. Your quick intr...

2022-10-30

Software exec at the end is the best. Your quick intro to patents, trademarks, copyright, and licenses. I see too many comments where people...

#best#end#exec#intro#quick#software
Video

Checking out Flan T5 large language models. Let me kn...

2022-11-09

Checking out Flan T5 large language models. Let me know what wisdom you can find in this model. #machinelearning #datascience #largelanguage...

#checking#datascience#flan#large#largelanguagemodels#machinelearning
Video

New style of content, let me know if you want more li...

2022-11-13

New style of content, let me know if you want more like this. Predict sentiment #machinelearning #datascience #transformers #huggingface

#datascience#huggingface#machinelearning#new#style#transformers
Video

Learning curves, it’s a technique I use all the tim...

2022-11-16

Learning curves, it’s a technique I use all the time when training models. Thanks to Todd C for showing me the best way to explain this. #...

#bigdata#datascience#learning#machinelearning#statistics#technique
Video

Automatic Speech recognition in 3 lines of code using...

2022-11-17

Automatic Speech recognition in 3 lines of code using wav2vec2 in transformers #datascience #machinelearning #huggingface #automaticspeechre...

#asr#automatic#automaticspeechrecognition#datascience#huggingface#machinelearning
Video

Galactica by Meta. Cool model, poor form on sharing i...

2022-11-17

Galactica by Meta. Cool model, poor form on sharing it out. #datascience #machinelearning I feel for students, it was going to write a lot o...

#cool#datascience#galactica#machinelearning#meta#model
Video

I need to focus on adding more Regularization to my l...

2022-11-19

I need to focus on adding more Regularization to my life. #datascience #statistics #regularization

#adding#datascience#focus#need#regularization#statistics
Video

Meta’s Cicero for playing Diplomacy is impressive a...

2022-11-23

Meta’s Cicero for playing Diplomacy is impressive and a bit scary. #statistics #datascience #machinelearning #diplomacy

#cicero#datascience#diplomacy#machinelearning#meta#statistics
Video

A couple of examples of what not to do and what you s...

2022-12-23

A couple of examples of what not to do and what you should do when presenting your data science results to the business. #datascience #stati...

#couple#datascience#enterpriseai#examples#machinelearning#statistics
Video

The politics of ChatGPT, it’s no different than any...

2022-12-27

The politics of ChatGPT, it’s no different than any other technology and is not neutral. If you want a simple explanation of how ChatGPT w...

#chatgpt#datascience#machinelearning#openai#politics#technologyethics
Video

Dtreeviz 2.0 - Visualizing Decision Trees

2022-12-28

Dtreeviz 2.0 - Visualizing Decision Trees

#decision#dtreeviz#trees#visualizing
Video

GPT3.5 takes the bar exam with very little tuning. It...

2022-12-30

GPT3.5 takes the bar exam with very little tuning. It does pretty well. #gpt #datascience #machinelearning #barexam #law

#barexam#datascience#gpt#gpt3#law#machinelearning
Video

Clustering with k-means. This skit was inspired by th...

2022-12-31

Clustering with k-means. This skit was inspired by the examples in Schubert’s paper on why to stop using the elbow criterion for k-means. Any other cl...

#clustering#datascience#kmeans#machinelearning#means#statistics
Video

Image captioning models - GIT from Microsoft and BLIP...

2023-01-05

Image captioning models - GIT from Microsoft and BLIP from Salesforce #datascience #machinelearning #imagecaptioning

#captioning#datascience#image#imagecaptioning#machinelearning#models
Video

Scaling laws help us figure out how to manage the amount...

2023-01-07

Scaling laws help us figure out how to manage the amount of training data versus the model size. DeepMind showed with Chinchilla by using more ...

#datascience#deepmind#largelanguagemodels#machinelearning#nvidia#openai
Video

Dealing with overplotting, another visualization tip...

2023-01-08

Dealing with overplotting, another visualization tip from Data to Viz #datascience #machinelearning #statistics #datavisualization

#datascience#datavisualization#dealing#machinelearning#plotting#statistics
Video

Using LangChain with GPT3. I am seeing lots of cool d...

2023-01-14

Using LangChain with GPT3. I am seeing lots of cool demos based on LangChain and needed to make sure I covered it. It’s an easy way to take adv...

#datascience#gpt3#langchain#largelanguagemodels#machinelearning#using
Video

Picking a GPU for deep learning based on Tim Dettmers...

2023-01-16

Picking a GPU for deep learning based on Tim Dettmers’ classic blog post. #datascience #machinelearning #deeplearning #gpu

#datascience#deep#deeplearning#gpu#machinelearning#picking
Video

How companies use your data for training models will ...

2023-01-19

How companies use your data for training models will be a big issue this year. GitHub is being sued for Copilot and Hugging Face has been bu...

#bigcode#companies#copilot#datascience#github#huggingface
Video

Google’s Sparrow is the rumored competitor to OpenA...

2023-01-21

Google’s Sparrow is the rumored competitor to OpenAI’s ChatGPT. Check out the paper to see lots of examples of it chatting. It looks really ...

#chatgpt#datascience#google#googlesparrow#machinelearning#openai
Video

Should you take the time to learn Kubernetes as a dat...

2023-01-23

Should you take the time to learn Kubernetes as a data scientist? Or are you already overloaded learning data science? #datascience #machinelear...

#data#datascience#kubernetes#machinelearning#take#time
Video

I can’t make this stuff up. OpenAI released their c...

2023-01-31

I can’t make this stuff up. OpenAI released their classifier and I saw all these messages about how ineffective it is. Wanted to get this ...

#chatgpt#datascience#get#gpt3#huggingface#machinelearning#make#openai#stuff
Video

My second try to explain in context learning or few s...

2023-01-27

My second try to explain in context learning or few shot learning with large language models. It’s very cool and why these models are so e...

#datascience#fewshotlearning#gpt3#incontextlearning#largelanguagemodels#machinelearning
Video

How enterprises are dealing with ChatGPT: it’s a pre...

2023-02-05

How enterprises are dealing with ChatGPT: it’s a pretty familiar cycle of grief. The good thing is it does open up lots of cool use cases. ...

#chatgpt#datascience#dealing#enterprisearchitecture#enterprises#machinelearning
Video

ClimaX, a new transformer-based model for predicting ...

2023-02-07

ClimaX, a new transformer-based model for weather and climate forecasting. Great example of the flexibility of transformers based...

#based#climatemodel#climax#datascience#machinelearning#transformers
Video

Random forests and their ease of use are important in...

2023-02-18

Random forests and their ease of use are important in understanding modern data science. #datascience #machinelearning #statistics #randomfo...

#amazon#bing#chatgpt#dataprep#datascience#decisiontree#machinelearning#openai#randomforest#statistics
Video

Replying to @anansaadi OpenAssistant is an open sourc...

2023-02-19

Replying to @anansaadi OpenAssistant is an open source project that aims to provide a chat based assistant that connects to other sources of...

#chatgpt#datascience#machinelearning#open#openai#openassistant
Video

Feature engineering and data preprocessing are an imp...

2023-02-27

Feature engineering and data preprocessing are an important part of the machine learning process. #datascience #machinelearning #featureengi...

#data#datascience#engineering#feature#featureengineering#machinelearning
Video

Pandas 2.0 combining with Arrow. A short recap on how i...

2023-03-01

Pandas 2.0 combining with Arrow. A short recap on how it fits in with polars, dplyr, and data.table. #datascience #machinelearning #rstats #py...

#datascience#dplyr#machinelearning#pandas#polars#rstats
Video

ChatGPT price drop. Let’s break down how much the p...

2023-03-02

ChatGPT price drop. Let’s break down how much the price dropped, how OpenAI could drop the price, the effects on performance, what is goin...

#chatgpt#cohere#datascience#langchain#machinelearning#openai
Video

OpenAI plugins! Let’s get everyone’s APIs working with ...

2023-03-23

OpenAI plugins! Let’s get everyone’s APIs working with LLMs! This is a good thing. #largelanguagemodels #langchain #openai #datascience #machin...

#chatgpt#datascience#langchain#largelanguagemodels#machinelearning#openai

© Rajiv Shah. All Rights Reserved.