My practicum was a data engineering summer internship at A.M. Best Rating Services in 2023. I was part of the summer internship program with 9 other interns. We worked in person 2 days a week and remotely the rest of the time, with additional remote days when needed, such as during the large Canadian forest fires, which made it unsafe to drive. We also did intern team-building activities, like a park cleanup. The workplace was very professional but still laid back. My supervisor was Rohit Motiani, who guided me through the internship. I got this internship through a referral from a family friend, so I would recommend asking around to see if anyone has an opportunity in your field of interest.
As for what I did there, I learned and worked with Apache Airflow, a workflow orchestrator. It is mainly used to build ETL (extract, transform, and load) pipelines, which can be highly dynamic and customizable. My job was to figure out how it could be applied to a hybrid cloud setup, and we determined that it could serve as the bridge between the on-premises databases and the cloud environment. In addition, I learned about databases, SQL, pandas, and using Python to connect everything together through its DB-API. These are topics that pretty much every computer science major will encounter at some point in their career, so it was great to learn them early. I learned how the company's data is stored, how it fits together, and why it is structured the way it is. Even though it was just data for one company, similar practices are used everywhere. However, I had to spend around 4 weeks of the internship just learning these subjects, which slowed down my progress greatly.
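To make the extract, transform, and load idea concrete, here is a minimal sketch in Python using only the standard library's `sqlite3` module, which follows the DB-API. It is not the pipeline I built at A.M. Best: the table names (`raw_ratings`, `clean_ratings`), the function names, and the sample rows are all made up for illustration, and two in-memory SQLite databases stand in for an on-premises source and a cloud target. In Airflow, each of the three functions would roughly correspond to one task in a DAG.

```python
import sqlite3


def extract(conn):
    """Read raw rows from the (hypothetical) source table."""
    cur = conn.cursor()
    cur.execute("SELECT company, rating FROM raw_ratings")
    return cur.fetchall()


def transform(rows):
    """Tidy the text fields and drop rows with a missing or blank rating."""
    return [
        (company.strip().title(), rating.strip().upper())
        for company, rating in rows
        if rating is not None and rating.strip()
    ]


def load(conn, rows):
    """Write the cleaned rows to the (hypothetical) target table."""
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO clean_ratings (company, rating) VALUES (?, ?)", rows
    )
    conn.commit()


def run_etl(source, target):
    """Run the three steps in order and return how many rows were loaded."""
    rows = transform(extract(source))
    load(target, rows)
    return len(rows)


if __name__ == "__main__":
    # Stand-in for the on-premises database, filled with made-up sample data.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE raw_ratings (company TEXT, rating TEXT)")
    source.executemany(
        "INSERT INTO raw_ratings VALUES (?, ?)",
        [("  acme insurance ", "a+"), ("beta mutual", None)],
    )

    # Stand-in for the cloud-side database.
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE clean_ratings (company TEXT, rating TEXT)")

    print(run_etl(source, target), "row(s) loaded")
```

Keeping the three steps as separate functions is what lets an orchestrator schedule, retry, and monitor each one on its own.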
In addition, I worked at A.M. Best again in the Fall 2023 semester as a part-time employee. This time I worked under Matt Coppola, who also helped me grow greatly. During this period, I learned a lot about data engineering in the cloud using the Azure cloud platform. I created my first cloud data pipelines using Azure Data Factory and learned to work with data lakes and databases in tandem. I would also recommend working really hard and getting noticed, since it can lead to an opportunity like this one.
I also gave demonstrations of many of the tasks I worked on, such as Airflow ETLs and Azure Data Factory pipelines. Halfway through my internship, after I finished learning these technologies, I gave a small presentation on everything I had learned. Most of these presentations were to the data engineering team, and one was to many members of the company at the final intern presentations.
Beyond the internship, I learned about the usefulness of computing in the real world. In the past, much of this data would likely have been collected on paper, but thanks to data engineering, it can be more organized, more easily accessible, and much more scalable. I also got a taste of what it's like to work at a job, even though this was only an internship and a part-time position. However, I also learned that my interests lie more in the use and analysis of that data (data science) than in data engineering. I understand the importance and intricacies of data engineering and its associated workflows, but I believe that my future career will revolve around the data science side of things. On top of that, I think I would like to pursue postgraduate studies. My interest in this grew after I contacted a professor to help with a research project in Spring 2024, which I will continue over the summer as an intern.