I’ve been looking to revisit my baby name analysis and wanted to play around a bit more with Shiny, and finally found the time to do it. The site is up on Shinyapps at https://pedramnavid.shinyapps.io/shinybabies/ and looks at the trend in baby names by gender in Ontario, Canada. There are three views: the first looks at the changing trend in suffixes, the second explores uniqueness, and the last looks at gender-neutrality.
With Trump’s victory, I thought it would be interesting to scope out some Trump-related datasets and get some practice with different machine learning algorithms. I’ve seen some work on generating text with Markov chains, but thought a recurrent neural network might be a little more fun to play with.

Web Scraping

Grabbing a large enough dataset was the first problem to solve. I did find a GitHub repository that had some Trump speeches, but the data was three months old as of this post.
I often get asked by co-op students at work how they can get started with R. While sites like Kaggle are great for finding lots of datasets and entering competitions to see how many tenths of a point you can extract from your model, my advice to those starting out is to pick a topic or question that actually interests you. It’s a hundred times easier to analyze something you’ve been pondering than fifty columns of anonymized, standardized numbers.
Overview and Definitions

The purpose of A/B testing is to determine, using statistical methods, whether an experiment produces an effect that is practically significant enough to support implementation. This is not as simple as checking whether the rates of two groups differ, because sampling from a population is inherently random: two samples drawn from the same population will rarely have exactly the same mean. Consider this toy example:

    library(scales)  # provides comma() for number formatting
    set.seed(1234)   # make the random draws reproducible
    # Two samples drawn from the same distribution: Normal(mean 0, sd 1)
    pop_1 <- rnorm(100, 0, 1)
    pop_2 <- rnorm(100, 0, 1)
    paste("The mean of pop_1 is: ", comma(mean(pop_1)))
    ##  "The mean of pop_1 is: -0.