“Smart data scientists don’t just solve big, hard problems; they also have an instinct for making big problems small.”
― D.J. Patil, Data Jujitsu: The Art of Turning Data into Product
That’s a good quote right there, but anyone who, like me, has taken UTD’s MIS 6359 course knows very well that it’s easier said than done. MIS 6359 is Advanced Statistics for Data Science, and those who have an affinity for statistics can nail it pretty well.
If you’re planning on taking it for your master’s, welcome to Tier One of crunching numbers. As someone who majored in computer science during her undergrad and had minimal interaction with math, I was initially a little intimidated by it. However, things began to change when I paid attention to the details.
You know what they say, “If you ask a stupid question, you may feel stupid; if you don’t ask a stupid question, you remain stupid.” It took a lot of help from my friends and plenty of (possibly) stupid questions about the lectures and the course in general. At a slow but steady pace, understanding the concepts became comparatively easy.
Don’t get me wrong, I’m not a seasoned scholar in advanced statistics now, but I have grown to respect the subject and have fun with it. So now I’m sharing some knowledge straight from the professor, Dr. Rasoul Ramezani, in the hope of helping students like me.
Q: Is this course for everyone?
This course is best suited for students who possess preliminary knowledge of statistics or have taken at least one statistics course during their undergrad. The course focuses on the basics initially but quickly moves on to more complex material, which makes it challenging.
Advanced Statistics for Data Science is an excellent course for those interested in pursuing careers as data analysts or researchers in any area dealing with data, from social studies to finance. The contents of this course equip students with enough knowledge to apply these concepts to any real-life data.
Q: Since data science is a vague term, how would you define it?
As a professor teaching this course, I consider data science the heart of the process wherein complex and messy data is transformed into outputs that are user-friendly, interpretable, and easy for humans to comprehend.
Q: Is there anything you would advise the students to be prepared with?
I suggest they have a good understanding of calculus and algebra, and get the hang of the basics of how variables relate to one another in statistics. If students are comfortable with these concepts, they’re good to go.
Q: What should a student focus on more – statistics or programming?
Students first need to develop strong proficiency in the basics of statistics ab initio and make their way up to becoming good statisticians. In the meantime, they also need to learn programming on different platforms such as RStudio, MATLAB, SAS, etc. Being adept in programming without understanding the meaning of the inputs and outputs is useless. That is why it’s important to start by understanding the basics of statistics and work your way through its depths with practice.
Q: What is the hardest part of this course?
The most challenging part of this course is understanding the relationship between the theory explained in class and its application to real-life scenarios. Students can overcome this with practice and mindfulness when solving problems related to the concept.
Now that you’ve read words of wisdom from someone who has been practicing this subject for years, I hope some of your doubts are put to rest. You can ask further questions or seek guidance from Professor Ramezani by emailing him at Rasoul.Ramezani@utdallas.edu.