Let’s play a guessing game. I’m going to choose a big set of numbers. Like a billion jillion numbers.

    # This is R code.
    library(tidyverse)

    generator_mean = floor(runif(1) * 10000 + 10) # SECRETS
    generator_standard_deviation = floor(runif(1) * 30 + 10) # MORE SECRETS
    my_numbers = rnorm(10000000, generator_mean, generator_standard_deviation) # OK fine it'll be 10 million instead of a billion jillion.

My set of numbers has an average that you’re trying to guess.

Continue reading

rmarcksharpdown

This is an R Markdown document…

    library(tidyverse)
    data_frame(X = rnorm(1000)) %>% ggplot(aes(X)) + geom_histogram()

And this is some C# code…

    Console.WriteLine("Hello World!");

    ## Hello World! 🤩

…that the document just executed! 🤩

And here’s some more C# code that talks across different Rmd code blocks…

    var greatDay = "What a great day!";

    greatDay = greatDay + " I hope yours is good too! ❤️🧡💚💙💜";
    Console.WriteLine(greatDay);

    ## What a great day!

Continue reading

Gratitude

I’ve been meditating lately. I started in December, and I haven’t done it every day, but I enjoy it. The topic of this morning’s meditation was gratitude. It led me through feeling grateful for different things. Something someone did for me. Something someone I don’t know did. Something from nature. Something I did. Something small. Something big. My task for the week was to deploy an A/B test of a new job recommendation algorithm.

Continue reading

Last time we learned about a method of dimensionality reduction called random projection. We showed that, using random projection, the number of dimensions required to preserve the distances between the points in a set depends only on the number of points in the set and the maximum acceptable distortion set by the user. Surprisingly, it does not depend on the original number of dimensions. The proof that random projections work is hard to understand, but the method is very simple to implement in just a few steps.
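To make "simple to implement" a little more concrete, here's a minimal sketch of a Gaussian random projection in base R. It isn't necessarily the implementation from the full post. The sizes, the seed, and every variable name are assumptions picked for illustration, and the target dimension is chosen by hand rather than derived from a distortion bound.

```r
# A minimal Gaussian random projection sketch (all names and sizes are illustrative).
set.seed(42)

n_points   <- 50    # number of points in the set
d_original <- 2000  # original number of dimensions
k_reduced  <- 1000  # target number of dimensions (picked by hand for this sketch)

# Step 1: some points in the original space, one point per row.
X <- matrix(rnorm(n_points * d_original), nrow = n_points)

# Step 2: a random matrix with i.i.d. N(0, 1/k) entries, so that squared
# distances are preserved in expectation.
R <- matrix(rnorm(d_original * k_reduced, sd = 1 / sqrt(k_reduced)),
            nrow = d_original)

# Step 3: project the points by multiplying them by the random matrix.
Y <- X %*% R

# Step 4: compare every pairwise distance before and after projecting.
original_distances  <- as.vector(dist(X))
projected_distances <- as.vector(dist(Y))
distortion <- projected_distances / original_distances

# Ratios close to 1 mean the distances survived the projection.
range(distortion)
```

The projection matrix never looks at the data. It's just random numbers scaled by the target dimension, which is a big part of why the method is so cheap to set up.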

Continue reading

So, there’s this bit of math called the Johnson-Lindenstrauss lemma. It makes a fairly fantastic claim. Here it is in math-speak from the original paper…

Fantastic, right? What does it mean in slightly more lay speak?

The Fantastic Claim

Let’s say you have a set of 1000 points in a 10,000-dimensional space. These points could be the brightness of pixels from 100x100 grayscale images. Or maybe they’re the term counts of the 10,000 most frequent terms in a document from a corpus of documents.
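To put a rough number on the claim for a set of 1000 points, here's a small sketch in R. It assumes one commonly quoted form of the bound, k ≥ 4 ln(n) / (ε²/2 − ε³/3), which comes from a later simplified proof of the lemma and may not be the exact statement the full post works from. The function name `jl_min_dimensions` and its arguments are made up for this example.

```r
# Sketch: a commonly quoted Johnson-Lindenstrauss bound on the target dimension.
# Assumption: k >= 4 * ln(n) / (eps^2 / 2 - eps^3 / 3); other forms of the bound exist.
jl_min_dimensions <- function(n_points, max_distortion) {
  stopifnot(n_points > 1, max_distortion > 0, max_distortion < 1)
  denominator <- max_distortion^2 / 2 - max_distortion^3 / 3
  ceiling(4 * log(n_points) / denominator)
}

# 1000 points, distances allowed to stretch or shrink by up to 10%.
k_for_ten_percent <- jl_min_dimensions(1000, 0.1)
k_for_ten_percent

# Loosening the acceptable distortion shrinks the required dimension quickly.
sapply(c(0.1, 0.2, 0.5), function(eps) jl_min_dimensions(1000, eps))
```

Notice that the function takes no argument for the original number of dimensions. Under this bound the answer is the same whether the points start out in 10,000 dimensions or 10 million.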

Continue reading


Jason Punyon

Chaotic Good w a splash of Data. Dad x2. On sabbatical from Stack Overflow. He/him.

Staff Data Engineer at Stack Overflow