<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.9.0">Jekyll</generator><link href="atbiggie.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="atbiggie.github.io/" rel="alternate" type="text/html" /><updated>2021-11-19T20:20:25+00:00</updated><id>atbiggie.github.io/feed.xml</id><title type="html">Autumn Biggie</title><subtitle>NCSU Statistics Grad Student</subtitle><entry><title type="html">Course Reflections</title><link href="atbiggie.github.io/Course-Reflections/" rel="alternate" type="text/html" title="Course Reflections" /><published>2021-11-19T00:00:00+00:00</published><updated>2021-11-19T00:00:00+00:00</updated><id>atbiggie.github.io/Course-Reflections</id><content type="html" xml:base="atbiggie.github.io/Course-Reflections/">&lt;p&gt;Looking back on my previous posts, not much has changed about my view on
what data scientists do. As I’ve grown in my skill set and my knowledge of
data-handling techniques, ways to access data, and ways
to create beautiful visualizations and presentations, I’ve been
discovering what it FEELS like to be a data scientist. Reading about the
difference between a statistician and a data scientist is one thing, but
using new tools and learning to THINK like a data scientist is another
thing. I’m excited to keep using the tools I’ve learned and continue
exploring this growing field.&lt;/p&gt;

&lt;p&gt;Over the span of this semester, I have grown to view R not only as an
intuitive language for data science, but also as a language I will use for
the rest of my career. It has easy syntax, it’s well documented, and
other coders are constantly sharing ways to do new things and analyze
data. I think the value in R lies in the many different ways to present
my work. R Markdown, GitHub Pages, and Shiny apps provide so many
possibilities for creating engaging reports and interactive
presentations.&lt;/p&gt;

&lt;p&gt;Now that I’ve taken ST 558, I’ll be looking for a job that allows me to
use R frequently, or even as my main coding language. In addition, I’m
excited to start using R for some personal projects outside of school,
as well as to present data for other statistics classes.&lt;/p&gt;

&lt;p&gt;I’ll keep you posted on what I create!&lt;/p&gt;</content><author><name></name></author><summary type="html">Looking back on my previous posts, not much has changed about my view on what data scientists do. As I’ve grown in my skillset and knowledge of different data-handling techniques, ways to access data, as well as how to create beautiful visualizations and presentations, I’ve been discovering what it FEELS like to be a data scientist. Reading about the difference between a statistician and a data scientist is one thing, but using new tools and learning to THINK like a data scientist is another thing. I’m excited to keep using the tools I’ve learned and continue exploring this growing field.</summary></entry><entry><title type="html">My Project 2 Experience</title><link href="atbiggie.github.io/My-Project-2-Experience/" rel="alternate" type="text/html" title="My Project 2 Experience" /><published>2021-10-30T00:00:00+00:00</published><updated>2021-10-30T00:00:00+00:00</updated><id>atbiggie.github.io/My-Project-2-Experience</id><content type="html" xml:base="atbiggie.github.io/My-Project-2-Experience/">&lt;h2 id=&quot;automation-predictive-modeling-and-partners-oh-my&quot;&gt;Automation, Predictive Modeling, and Partners? Oh my!&lt;/h2&gt;

&lt;p&gt;Usually, I’m not too thrilled to work with a partner on a project. I’ve
always been the one to end up with the heaviest workload, often because
my eye for detail allows me to catch mistakes when reviewing the group’s
work. I’m always nervous that my partner will do a lousy job and I’ll
have to put in twice the effort.&lt;/p&gt;

&lt;p&gt;This project was far from that scenario! For this project, my partner
Ryan Bunn and I analyzed a news popularity dataset, producing multiple
files reporting analyses for each of the six different data channels:
lifestyle, entertainment, social media, business, tech, and world. We
went through the process of reading in the dataset, data manipulation
and variable creation, summary table creation, data visualization, and
model fitting using linear regression, random forest, and boosted tree
methods. At the end of each document, a “best model” was declared.&lt;/p&gt;

&lt;p&gt;It was a pleasure to collaborate with Ryan on this project and I
appreciate the opportunity for my view of group work to change. There is
nothing that I would change about the process or the product of this
project. Throughout the development process, our tasks as
collaborators were clear and each of us completed them in a timely
manner. Communication between Ryan and me was always thorough and clear,
allowing for a smooth workflow.&lt;/p&gt;

&lt;p&gt;The most difficult part for me was getting excited about this data.
During my last project, it was easy to visualize trends in the data
during graph creation. However, this dataset was more complex, having
many observations, interactions, and relationships between variables
that were less pronounced. Exploring the variables absorbed a lot of my
time because it was difficult to find variables that created an
interesting scatterplot, histogram, etc.&lt;/p&gt;

&lt;p&gt;My biggest takeaway from this project was the importance of doing model
comparison on a test set. Sometimes the random forest or boosted tree
model would appear to perform better on the training set we created, but
one of the linear regression models would rise to the top when tested on
the test set. This was surprising to me, but it exposed the value of
having a test set to accurately measure the efficacy of each model.&lt;/p&gt;
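
&lt;p&gt;A minimal sketch of that kind of comparison, under loud assumptions: it
uses the built-in mtcars data and two lm() fits as stand-ins, not the news
popularity data or the random forest and boosted tree models from the
project. The object names (train_idx, rmse, fits, results, best_model) are
made up for illustration.&lt;/p&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Illustrative only: mtcars stands in for the news popularity data.
set.seed(558)
train_idx &amp;lt;- sample(nrow(mtcars), size = floor(0.7 * nrow(mtcars)))
train &amp;lt;- mtcars[train_idx, ]
test  &amp;lt;- mtcars[-train_idx, ]

# Root mean squared error of a fitted model on a given data set.
rmse &amp;lt;- function(fit, data) {
  sqrt(mean((data$mpg - predict(fit, newdata = data))^2))
}

# Two candidate models: a simple one and a more flexible one.
fits &amp;lt;- list(
  simple   = lm(mpg ~ wt, data = train),
  flexible = lm(mpg ~ wt + hp + disp + qsec + drat, data = train)
)

# Compare training error with test error for each candidate.
results &amp;lt;- data.frame(
  model      = names(fits),
  train_rmse = sapply(fits, rmse, data = train),
  test_rmse  = sapply(fits, rmse, data = test)
)
results

# Declare the &quot;best model&quot; by test-set RMSE, not training RMSE.
best_model &amp;lt;- results$model[which.min(results$test_rmse)]
best_model
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because the flexible model contains every predictor in the simple one,
its training RMSE can only be lower or equal, so the test_rmse column is
the one that can actually separate the two.&lt;/p&gt;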

&lt;p&gt;Overall, my experience with this project was a pleasant one and I’m
grateful to have been paired with someone so easy to work with. I’m
excited to tackle the next project!&lt;/p&gt;

&lt;p&gt;Visit our Project 2 Repository
&lt;a href=&quot;https://github.com/atbiggie/Project2&quot;&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Visit the Project 2 &lt;a href=&quot;https://atbiggie.github.io/Project2&quot;&gt;landing page
here&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">Automation, Predictive Modeling, and Partners? Oh my!</summary></entry><entry><title type="html">Reflections On My First R Project</title><link href="atbiggie.github.io/Reflections-On-My-First-R-Project/" rel="alternate" type="text/html" title="Reflections On My First R Project" /><published>2021-10-04T00:00:00+00:00</published><updated>2021-10-04T00:00:00+00:00</updated><id>atbiggie.github.io/Reflections-On-My-First-R-Project</id><content type="html" xml:base="atbiggie.github.io/Reflections-On-My-First-R-Project/">&lt;p&gt;Today I finished my first project in RStudio. I built a GitHub page
through R that teaches visitors how to access an API, build
user-friendly functions in the process, and perform
exploratory data analysis.&lt;/p&gt;

&lt;p&gt;When I first glanced at the Project instructions, I was both excited and
intimidated. The estimated amount of time seemed like a lot! However, I
was glad to be able to choose the API I wanted to access using a
provided list. Initially I chose the &lt;a href=&quot;https://covid19api.com/&quot;&gt;Covid-19
API&lt;/a&gt;, but became frustrated when a network
error lasted all day, putting the project on hold. Since I had only
written a function to access the API up to that point, I decided to
start over and choose the OneCall portion of the &lt;a href=&quot;https://openweathermap.org/api/one-call-api#data&quot;&gt;OpenWeather
API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The longer I spent coding my way through this project, the more I
enjoyed the process of function writing, debugging, discovering trends
in the graphs I created, and committing my thoughts to the R Markdown
notebook. The most difficult part of the process was getting over the
intimidation of using the render() function to render the document
instead of using the knit button, as well as having to add, commit, and
push my changes regularly to GitHub. In retrospect, these are both very
easy processes that didn’t take much time or brain power, but the fact
that I couldn’t visualize how these processes function made both tasks
seem difficult at first.&lt;/p&gt;

&lt;p&gt;As for the logic and programming, I didn’t have much trouble. The
hardest part was probably looking up new functions for data cleaning as
well as making decisions about what contingency tables, numerical
summaries, and plots I wanted to make. However, that was also the most
fun. Imagining interesting comparisons between variables and then
working to put together the necessary code to create the plot built
suspense because I was excited to see the result.&lt;/p&gt;

&lt;p&gt;If I were to do this project over again (which I may do just for fun
with a different API), I would become more familiar with which functions
and options can be used when rendering a GitHub document in R
Markdown. I ran into a few issues with leaflet() and a few other
functions that work easily in HTML output but are trickier to
include in GitHub Pages. Becoming familiar with these limitations would
save me hours of Google searches in the future.&lt;/p&gt;

&lt;p&gt;Overall, I really enjoyed this project, including the learning process
as well as the final vignette I created. I’m looking forward to tackling
more stuff like this in the future!&lt;/p&gt;

&lt;p&gt;To check out my project, visit &lt;a href=&quot;https://atbiggie.github.io/Project1/&quot;&gt;https://atbiggie.github.io/Project1/&lt;/a&gt;&lt;br /&gt;
To see my Project 1 Repository, visit
&lt;a href=&quot;https://github.com/atbiggie/Project1&quot;&gt;https://github.com/atbiggie/Project1&lt;/a&gt;&lt;br /&gt;
To see the repository that hosts my blog, visit
&lt;a href=&quot;https://github.com/atbiggie/atbiggie.github.io&quot;&gt;https://github.com/atbiggie/atbiggie.github.io&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">Today I finished my first project in RStudio. I built a GitHub page through R that teaches visitors how to access an API, building user-friendly functions in the process, as well as performing exploratory data analysis.</summary></entry><entry><title type="html">Programming Background</title><link href="atbiggie.github.io/Programming-Background/" rel="alternate" type="text/html" title="Programming Background" /><published>2021-09-07T00:00:00+00:00</published><updated>2021-09-07T00:00:00+00:00</updated><id>atbiggie.github.io/Programming-Background</id><content type="html" xml:base="atbiggie.github.io/Programming-Background/">&lt;h2 id=&quot;my-not-so-hot-take-on-r&quot;&gt;My (not so hot) take on R&lt;/h2&gt;

&lt;p&gt;I wasn’t looking forward to learning R…&lt;/p&gt;

&lt;p&gt;When I started my first R class almost two years ago, I wasn’t all that
excited about it. I was more comfortable in SAS and had dabbled a bit in
Python, and I was content with limiting my coding knowledge to those two
languages. R just looked… ugly?&lt;/p&gt;

&lt;p&gt;After I got over my initial disgust toward the syntax, I realized that
the same nasty-looking syntax was surprisingly easy to learn. The
more capable I became as an R programmer, the more I realized that I
could perform many of the same analyses and render the same plots that I
could in SAS, only in far fewer lines of code and often with better
graphics.&lt;/p&gt;

&lt;p&gt;I especially appreciate the capability to create nicely formatted HTML
pages as well as interactive graphics. Although I miss the visual appeal
of a detailed and hierarchically formatted SAS proc step, I can finally
say that I prefer R as my primary coding language because of its ease of
use, flexibility, and intuitive (albeit ugly) syntax.&lt;/p&gt;

&lt;h2 id=&quot;example-r-markdown-output&quot;&gt;Example R Markdown Output&lt;/h2&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;plot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iris&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;../images/example%20graphics-1.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">My (not so hot) take on R</summary></entry><entry><title type="html">What is a Data Scientist?</title><link href="atbiggie.github.io/DataScience/" rel="alternate" type="text/html" title="What is a Data Scientist?" /><published>2021-08-17T00:00:00+00:00</published><updated>2021-08-17T00:00:00+00:00</updated><id>atbiggie.github.io/DataScience</id><content type="html" xml:base="atbiggie.github.io/DataScience/">&lt;p&gt;As discussed in many of the articles below, defining what makes a data scientist seems to be quite complicated. I’ve always thought of them as being rebranded statisticians, but with a more current name. However, I’m learning that there are some skillset and responsibility differences that set data scientists apart from their fellow statisticians. Being able to handle data with thousands of variables and millions of lines of code is one skill that data scientists must master and that statisticians can avoid. Presumably, being able to manage that much data requires superior coding skills in multiple languages and maybe some knowledge of artificial intelligence. Data science seems to be about taking massive amounts of data and using statistics and dynamic code to learn from it as efficiently as possible. Data scientists are the ones who drive the world’s most impactful business decisions for companies like Google and Facebook, while the word “statistician” evokes images of smaller clinical trials and carefully planned experiments or surveys. Data scientists know how to make sense of data sets that never stop growing.&lt;/p&gt;

&lt;p&gt;Although the traditional role of statisticians may be based more in statistical theory than in programming, I think the profession is evolving to meet the current big-data-driven world’s needs. Statisticians still have a massive knowledge base of statistics and mathematics, but many educational programs are shifting toward more of a focus on programming so that their graduating statisticians have a better chance of qualifying for those data science jobs. More classes about artificial intelligence and developing strong programming skills are popping up in course catalogs to bolster theory-based statisticians with the computer skills needed to use their statistics knowledge efficiently and on a large scale.&lt;/p&gt;

&lt;p&gt;As for myself, I believe I’m a statistician who is gaining the skills necessary to enter the world of data science. Most of my education has been about statistical theory and practice, but I’ve taken more recent classes that have allowed me to gain a strong knowledge base in SAS, dip my toes into SQL, and dive deep into R. I think it’s honing these skills, as well as exploring deep learning and artificial intelligence, that will prepare me to walk the common ground between statistics and data science.&lt;/p&gt;

&lt;p&gt;~ Autumn&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://medium.com/odscjournal/data-scientists-versus-statisticians-8ea146b7a47f&quot;&gt;https://medium.com/odscjournal/data-scientists-versus-statisticians-8ea146b7a47f&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;https://www.springboard.com/blog/ai-machine-learning/machine-learning-engineer-vs-data-scientist/&quot;&gt;https://www.springboard.com/blog/ai-machine-learning/machine-learning-engineer-vs-data-scientist/&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;https://www.simplilearn.com/data-science-vs-data-analytics-vs-machine-learning-article&quot;&gt;https://www.simplilearn.com/data-science-vs-data-analytics-vs-machine-learning-article&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;https://mixpanel.com/blog/this-is-the-difference-between-statistics-and-data-science/&quot;&gt;https://mixpanel.com/blog/this-is-the-difference-between-statistics-and-data-science/&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">As discussed in many of the articles below, defining what makes a data scientist seems to be quite complicated. I’ve always thought of them as being rebranded statisticians, but with a more current name. However, I’m learning that there are some skillset and responsibility differences that set data scientists apart from their fellow statisticians. Being able to handle data with thousands of variables and millions of lines of code is one skill that data scientists must master and that statisticians can avoid. Presumably, being able to manage that much data requires superior coding skills in multiple languages and maybe some knowledge of artificial intelligence. Data science seems to be about taking massive amounts of data and using statistics and dynamic code to learn from it as efficiently as possible. Data scientists are who drive the world’s most impactful business decisions for companies like Google and Facebook, while the word “statistician” evokes images of smaller clinical trials and carefully planned experiments or surveys. Data scientists know how to make sense of data sets that never stop growing.</summary></entry></feed>