Well … bit of a gap there. So I spent the last 3 weeks getting further and further behind. And then life got in the way a bit.
Turns out that, for me anyway, I really need to understand Python, pandas, SQL, JSON, lambda, OpenRefine, regex, the command line etc. a lot better. Although you don’t need to be an expert in these for the course (except for a familiarity with Python – which I didn’t have), I believe it’s helpful to understand how all these things slot together. Once I understand that a bit better, I think I’ll be able to get on with the business of actually playing with the data and fully learning and exploring the course – which I love.
I didn’t want to waste any more time Googling code syntax and panicking about falling further behind, and so I’ve deferred the course until October. Although that adds a year to my degree (as I only do one module a year), it gives me six months to get a better handle on all these concepts. And so I’ll use this blog to post about how I’m learning all these things.
It feels strange to have gone from ‘ohmygod I’m so far behind, the TMA deadline is approaching, aargghh!!!!’, to ‘and relax’ in one short phone call. But it’s definitely the right call for me. I’ve never done anything like this before – I love a challenge, and I’m not a quitter. But I’m also a card-carrying member of the 100% Perfectionist Club, and I simply can’t hand in assignments if they’re not the very best that I can do. And, with a summer’s revision, I will do much better.
This week has been an exercise in catch up, and my lack of experience with Python, SQL and anything to do with databases is slowing me WAAAY down. I should be on section 5 this week. I’ve just finished section 3, and am now trying to do the two TMA questions related to these sections before even starting section 4.
Lessons I have learned this week – copy everything that goes into the Jupyter notebooks into a Word doc, just in case you spend 5 hours working on a TMA problem, only for the notebook (and its backup) to fall over and die. I eventually got them back from the checkpoint file, but it took forever to load, and I was in a bit of despair there for a while.
This week I added the TM351 virtual machine, Jupyter and OpenRefine to my laptop for work, as I don’t think I can install all that software onto a work PC.
Everything fell over on my home PC with the virtual machine, Vagrant and the course software, so I spent a couple of evenings, with help from the forums, pulling that back to life.
I’ve been learning about dirty data, data laundering and cleansing.
Practical work: cleansing data in OpenRefine – for example removing extra whitespace, stripping commas from monetary columns, and typecasting columns. I’m now learning how to do this in Jupyter using the Python pandas library.
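For anyone curious, here is a rough sketch of what those same three clean-up steps might look like in pandas. The data and column names are made up purely for illustration (they are not from the course notebooks), and this is just one way of doing it.

```python
import pandas as pd

# Hypothetical messy data, invented for this example
raw = pd.DataFrame({
    "council": ["  Fife ", "Angus  ", " Moray"],
    "spend": ["1,200", "15,750", "980"],
})

clean = raw.copy()

# 1. Remove extra whitespace around the text values
clean["council"] = clean["council"].str.strip()

# 2. Strip the commas out of the monetary column,
# 3. then typecast it from text to whole numbers
clean["spend"] = clean["spend"].str.replace(",", "", regex=False).astype(int)

print(clean)
```

Once the spend column is numeric, things like `clean["spend"].sum()` work as expected instead of gluing strings together.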
Oh God, regex is back in my life… great fun when it’s working, great frustration when I can’t figure it out…
Checked the iCMA and realised I had been panicking that it covered Parts 1-3, when it was only Parts 1-2, so I managed to get that out the road as I’m halfway through Part 3 at the moment. This means I can go to knitting group on Tuesday, so chuffed about that.
I’ve got a migraine and lots on this weekend so it’s a bit of a challenge to keep ahead of things. Mainly quite tired. The jump from level 2 to 3, in addition to the 7-month gap between modules, has led to me finding it difficult to get back into the swing of things. Plus I have my Bohus jumper to finish before Edinburgh Yarn Fest next month – priorities!
Started question 1 of TMA01 since that also only needs knowledge of Parts 1-2. Lots of fun putting everything into practice. I can really say I’m enjoying this course a lot. One of the very first that I’m happy to sit working at and don’t notice the time passing. I love coding, but I do get very easily distracted. I’m not sure how I would have passed my school exams if the internet had been around then! But I can quite happily work on this for ages. Let’s hope that continues throughout.
A late start to the week, what with multiple doctor, dentist, hospital appointments, cinema trips and the like. Wednesday already and I’m just getting started on week 2 material having not really finished week 1 fully. Parents here at the weekend mean I’m falling behind, so am going to take a day’s leave to catch up next week. No point taking it this week or I’ll just do a pre-parent clean instead of studying!
Reading: Cracked open Part 2 – Acquiring and representing data. Learning a bit more about encodings and how important they are in being able to capture data.
Totally confused by numbers for measurements and Stevens’ NOIR. The exercise said 20 mins and I gave it a good 1 hour and 20 mins and then had to move on. Slightly alarmed by reading lists that require me to look up a word in every sentence… no chance to get a train of thought going with that type of reading!
Practical: Also using the Jupyter notebooks to learn what pandas is about – mainly series and DataFrames i.e. making pretty tables with data!
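As a small illustration of the two ideas (with invented names and numbers, not course data): a DataFrame is the whole table, and each column you pull out of it is a Series.

```python
import pandas as pd

# A DataFrame is a table: named columns, one row per record
# (the people and values below are made up for illustration)
people = pd.DataFrame({
    "name": ["Ann", "Bob", "Cat"],
    "age": [30, 40, 50],
})

# Selecting one column gives back a Series
ages = people["age"]

print(people)
print(type(ages).__name__)
print(ages.mean())
```

In a Jupyter notebook, putting `people` on the last line of a cell renders it as a nicely formatted table, which is where the "pretty tables" come from.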
Feeling pretty pleased with myself as I managed to figure out what to do to create a JSON Table Schema when I’m still getting to grips with what exactly that is – NetBeans, my friends! Save as .json. Looks like HTML type markup.
Lots of new words, software and ideas to get my head round this week: JSON, pandas, OpenRefine, JSONLint, CSV Lint Service. But have to say that it’s pretty cool being able to get CSV data from government websites and make customised tables with it!
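For a flavour of what such a schema looks like, here is a minimal sketch built in Python rather than typed by hand. It assumes the basic shape of a list of "fields", each with a "name" and a "type"; the column names are invented for illustration and are not the ones from the course exercise.

```python
import json

# A tiny table schema describing a hypothetical two-column CSV file
schema = {
    "fields": [
        {"name": "council", "type": "string"},
        {"name": "spend", "type": "integer"},
    ]
}

# Turn the Python dictionary into JSON text, as it would appear in a .json file
schema_text = json.dumps(schema, indent=2)
print(schema_text)

# Reading the text back gives the same structure, so it is valid JSON
round_trip = json.loads(schema_text)
```

The curly braces and square brackets are what give it that markup-ish look, although JSON describes data rather than page layout.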
So that’s Part 2 pretty much in the bag. Print out Part 3 tomorrow – race through at breakneck speed as iCMA is due in a week today and is assessing Parts 1-3.
Trying to get to grips with Notebook and IPython this week, while reading the intro to acquiring and representing data.
I’m feeling slightly overwhelmed with the volume of practical coding and reading I’m going to have to do. It’s going along quite slowly due to my lack of experience with Python and Notebook.
Also, although it’s all working at the moment, I’m not really sure I understand how the whole Linux virtual machine is set up. I’ll maybe have to look into that when I have more time.
The reading is interesting though and as I work with statistical data as part of my job, the course text and reading lists are proving useful and relevant.
The dir() function – I actually ‘ooh’d out loud! This is genius and perfect for beginners to Python, especially for collection objects. It lists the names of all the attributes and methods available on the object you pass in.
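A quick sketch of it in action on a list (the list contents are just an example). The names wrapped in double underscores are Python's internal ones, so filtering them out leaves the everyday methods.

```python
# Any object will do; a list is a handy collection to explore
shopping = ["wool", "needles"]

# dir() returns a list of attribute and method names as strings
names = dir(shopping)

# Hide the internal double-underscore names to see the everyday methods
public = [n for n in names if not n.startswith("_")]
print(public)
```

Seeing `append`, `sort` and friends in that output is a good prompt for what to look up next with `help(shopping.append)`.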
And now I know what lambda means: a small, unnamed function defined on the fly!
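A small sketch of the idea, using made-up values: a lambda is written in one line, right where it is needed, instead of being defined separately with def.

```python
# A named function written the usual way
def double(x):
    return x * 2

# The same thing as a lambda, created on the fly
double_on_the_fly = lambda x: x * 2

# Lambdas are most useful when passed straight into another function,
# here as the sort key (sort the words by their length)
words = ["pandas", "sql", "regex"]
by_length = sorted(words, key=lambda w: len(w))

print(double(4), double_on_the_fly(4))
print(by_length)
```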
As the last module of my 3rd year of the degree requires me to blog my progress through the final project, I thought I’d start getting into the habit early.
I’ve just started my first module of the third and final year of my OU journey. Already the difficulty level has increased as I start TM351. Although it says a lot for how far I’ve come that the requirement to know Python and its libraries is not something which throws me into panic. Once you’ve learned one computing language, another one is not so hard. Just remembering which uses semicolons, and which type of bracket goes round which data collection type really…
Anyway – onwards and upwards. There is a lot to accomplish with this module, and my personal life looks to be increasing stress-wise before too long, so best get as ahead as possible while I can.