I have always been good with numbers: at school I did my Maths GCSE a year early and got a B. Unfortunately there was a lot going on in my life at the time, so I didn’t go on to university then. Fast forward a decade or two and I’m now a recent graduate of Hallam University, where I went as a mature student to do a degree in computing, and the course involved loads of data-related modules.
During my Computing course at Hallam University I spent a lot of time learning about data: databases, data cleaning, data processing, data management, data analytics and data visualisation. The course also included a number of assignments with PowerPoint presentations in front of tutors, followed by answering their questions about the work. Presentations were something I was very comfortable doing.
I learned to collect raw sensor data using a Raspberry Pi (or a Pico running MicroPython/CircuitPython), and I learned that it is essential for every data point to be timestamped in ISO format, because that makes the data easy to use and re-use later on.
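Something along these lines is the kind of logging loop I mean (the sensor helper here is a stand-in for illustration, not code from a real project):

```python
import csv
import time
from datetime import datetime, timezone

def read_temperature():
    # Stand-in for a real sensor driver (e.g. a DS18B20 or DHT22 read).
    return 21.5

with open("sensor_log.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for _ in range(10):
        # ISO 8601 timestamp in UTC, so the data sorts and parses cleanly later.
        timestamp = datetime.now(timezone.utc).isoformat()
        writer.writerow([timestamp, read_temperature()])
        time.sleep(60)
```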
I learned how to clean data using Python. I cleaned and formatted data by writing code to fill in missing cells so they wouldn’t prevent the file from being analysed, and to standardise names so that two entries for the same name, one spelled with a capital letter and one without, are counted as the same.
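For example, a rough sketch with pandas (the file and column names are made up for the example):

```python
import pandas as pd

df = pd.read_csv("raw_data.csv")  # hypothetical input file

# Fill missing numeric cells so they don't break later calculations.
df["reading"] = df["reading"].fillna(df["reading"].mean())

# Standardise names so "smith" and "Smith" are counted as the same entry.
df["name"] = df["name"].str.strip().str.title()

print(df["name"].value_counts())
```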
It is also important that data stored in a database is normalised, which may require creating link tables, composite primary keys, or potentially using the timestamp as part of the primary key.
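As a rough illustration (table and column names invented for the example), a readings table keyed by sensor and timestamp might look like this in SQLite:

```python
import sqlite3

conn = sqlite3.connect("readings.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS sensor (
    sensor_id INTEGER PRIMARY KEY,
    location  TEXT
);
CREATE TABLE IF NOT EXISTS reading (
    sensor_id INTEGER REFERENCES sensor(sensor_id),
    timestamp TEXT,                       -- ISO 8601 string
    value     REAL,
    PRIMARY KEY (sensor_id, timestamp)    -- composite primary key
);
""")
conn.commit()
conn.close()
```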
I have used AI tools to do Natural Language Processing (NLP) to find out what people were writing about most often. I did something recently that involved NLP: I wrote a program that scraped data from a friend’s online blog and then created a word cloud to visualise what he was blogging about the most.
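A minimal sketch of that kind of pipeline (the URL is a placeholder, not my friend’s actual blog):

```python
import requests
from bs4 import BeautifulSoup
from wordcloud import WordCloud

# Placeholder URL for illustration only.
html = requests.get("https://example.com/blog").text
text = BeautifulSoup(html, "html.parser").get_text(separator=" ")

# Build a word cloud of the most common words and save it as an image.
wc = WordCloud(width=800, height=400, background_color="white").generate(text)
wc.to_file("wordcloud.png")
```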
There are more sophisticated ways of analysing language, e.g. BERT. Whenever anyone mentions BERT I automatically think of the word “bidirectional” (that’s what the B stands for). BERT is an AI/ML model that looks at the context around where a word sits within the sentence. For example, the word “massive” could be positive or negative, and we don’t know which unless we look at it relative to the other words in the sentence.
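As a hedged example, the Hugging Face transformers sentiment pipeline uses a BERT-family model under the hood, and you can watch it judge the same word differently depending on the surrounding sentence:

```python
from transformers import pipeline

# The default sentiment pipeline downloads a small BERT-family model
# (DistilBERT fine-tuned on SST-2) the first time it runs.
classifier = pipeline("sentiment-analysis")

# Same word, different context, different verdict.
print(classifier("The turnout was massive, far better than we hoped."))
print(classifier("They left a massive mess behind for everyone else."))
```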
If you want live, interactive data, I have experience building live data dashboards with Node-RED:
I even allowed users to enter comments into the dashboard, which were then stored in a database:
I know all about creating a pandas DataFrame from raw data like this, and what the resulting DataFrame looks like:
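For instance, a small illustrative example (the readings are invented):

```python
import pandas as pd

# Invented readings in the shape the logging loop above would produce.
raw = [
    ("2025-10-01T09:00:00+00:00", 14.2),
    ("2025-10-01T09:01:00+00:00", 14.4),
    ("2025-10-01T09:02:00+00:00", 14.3),
]

df = pd.DataFrame(raw, columns=["timestamp", "level"])
df["timestamp"] = pd.to_datetime(df["timestamp"])
print(df)
```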
If you are using live data, you will probably want a rolling average over the most recent day/week/month/year of data, and that’s what I did for the River Project. I used a deque to make that happen:
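The idea, roughly (the window size and readings are made up, not the River Project’s real figures):

```python
from collections import deque

WINDOW = 24  # e.g. keep the last 24 readings; the real window depends on the project
recent = deque(maxlen=WINDOW)

def add_reading(value):
    # Appending beyond maxlen automatically drops the oldest reading.
    recent.append(value)
    return sum(recent) / len(recent)  # rolling average over the window

print(add_reading(1.2))
print(add_reading(1.4))
```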
But that’s only possible if each data point has a corresponding timestamp, which is why I wrote code so that if someone uses separate columns for the date and the time, they are combined into a single column:
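A sketch of that kind of fix, assuming a hypothetical file with separate “date” and “time” columns:

```python
import pandas as pd

df = pd.read_csv("readings.csv")  # hypothetical file with "date" and "time" columns

# Merge the two columns into one proper timestamp column, then drop the originals.
df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"])
df = df.drop(columns=["date", "time"])
```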
My dissertation was all about how to collect and send data from a remote location with no Wi-Fi and no mains electricity. I used the 4G network and a protocol called MQTT on a device so small that it can be powered by an AA battery.
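A rough sketch of publishing a reading over MQTT with paho-mqtt (the broker address, topic and reading are placeholders, not my actual dissertation setup):

```python
import json
from datetime import datetime, timezone

import paho.mqtt.publish as publish

payload = json.dumps({
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "level": 0.42,  # invented reading
})

# MQTT messages are tiny, which is one reason the protocol suits
# low-power devices sending readings over a 4G connection.
publish.single("river/level", payload, hostname="broker.example.com")
```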
As you can see from this blog post of mine, I want to take away the mysticism from the world of computers: https://legitbainesblog.blogspot.com/2025/10/a-beginners-guide-to-ai-machine.html
So when I’m analysing data, I should be able to explain quickly and concisely how I got the data and what I did to process it, before I tell you the story the data is telling.
The one presentation I didn’t get right while at Hallam taught me a very important lesson about data visualisation (charts, graphs and so on). To make a presentation accessible to the people you are presenting to, you can’t show them everything; for every path you take there is a path not taken. Data analysis is a two-way process: as the analyst, it is my job to ask what the most important thing is that you want me to find out. Trying to show everything ends up showing nothing. The reason for doing data analysis is to find out what story the data is telling, and to tell that story I need to separate the signal from the noise by showing only the important things.
Power BI wasn’t part of my course, but I have used it at home and it looks like a great tool. It made web scraping really easy, although I’ve done web scraping with other tools as well, so it’s never been essential for my own projects. It was such a user-friendly tool that I’m more than happy to keep using it. However, for me the most important online tool is Google Colab: because it’s cloud based, I can analyse my data without crashing my own computer.
We have a code of ethics written by the BCS, and it was a big part of our course, including my dissertation. The very first thing it says is that we should use the power of computers to make the world a better place. I want to collect and analyse data to solve problems and make the world a better place.
There are always GDPR considerations, including in my dissertation, where my data had to be anonymised before I analysed it. Every organisation will have its own rules and regulations that I’m currently unaware of, but fortunately I have a photographic memory, so if you give me anything you want me to read about your rules and regs, I can assure you I will read it and remember it.
I really like reading things all the way to the end, so for me it’s important to find out how we are getting the data in the first place; I will always be curious and ask more questions than I strictly need to.
Could there be self-selection? What about all the people who didn’t reply to a survey: how do we find out what’s going on with them? Can we try sending the survey out in a different format so we get more responses?
If I get a data analyst job, I want everyone I’m working with to understand where the data I’m using comes from and the reasons behind my analysis of it. As time goes on I want to build trust and respect from everyone I work with for what I’m capable of contributing, and then I can gradually play a bigger role in how data is collected. Data analysis is not an end in itself: it is my job to find ways of helping the real experts and professionals be more efficient at their jobs. The people who turn up in person to fix things are the people who matter the most, and I want them to be as effective as possible.
My interpersonal skills and my ability to communicate with people from different social classes, ethnicities, genders and age groups are also very good. I’ve recently been a key member of the team trialling the skips in Page Hall, and it’s been a great project to be part of because I’ve been able to get on well with everyone in that community and build really good working relationships with my coworkers, supervisors and managers. (I can provide references for you to contact.)
I’ve got the knowledge, the tools and the work ethic to be really good at this job. All I need is to be given a chance to prove it.