After 25 Years of Coding in C And Perl.
As an independent author/researcher, there is of course nothing in my “job description” that says I should code in Python (or any other language). Yet for a long time, I thought coding in Python would help me a lot. It would mean more readers, and thus eventually, more revenue. At one point I advertised a job position, looking for people to translate my Perl scripts into Python. I thought doing it myself would take a lot of time with a long learning curve.
I was wrong. I am glad that I jumped on the Python bandwagon. It was much easier than I thought. Of course I still have plenty of things to learn. Here I relate my experience of learning Python on my own, without attending any class or reading any book on the topic. I hope my experiment will help people do things they hesitate to do, be it learning a new language or anything else. Some readers mentioned that what I did inspired them to move forward with some projects, rather than giving in to inertia.
First, I do not recommend learning it entirely on your own. You may save some time and money, but learning from a reputable instructor or book will keep you on the right track. In my case, I am a self-learner. Any class I attended in the past was a waste of time and progressed too slowly. But I am the exception rather than the rule. When I published my Python code, written in a unique style, I asked for criticism. I received a lot of feedback, which I share here. First, people said Python is not designed to reflect a coder’s personality, but should be written in a “standard way” to make it easy for other coders to reuse. And while my code is based on what I’ve found online, when you start from scratch it is not obvious how to tell good code snippets on the Internet from bad ones or obsolete practices.
That said, I did not really start from scratch. I have many years of experience coding in various languages, and scripting languages in particular. It definitely helps, though it also hurts, as some of my old habits conflict with the way Python was designed. On the plus side, I started with the latest version of Python.
It started about two weeks ago. I was working on a new fuzzy regression technique, math-free, model-free, yet providing prediction intervals (without bootstrap or resampling). At 2am on a Sunday, I could not sleep, haunted by math problems. I went to my desk and decided to try to install Python. Eighteen hours later, my first Python script was live and working properly. It deals with this fuzzy regression method. You can find the details, including the source code, here.
I work on a Windows machine, with the Cygwin environment (a Unix-like environment for Windows). Installing Python and running it under Cygwin was easy, and very similar to Perl. I quickly realized that I would benefit from using libraries like NumPy or Pandas. Installing Pandas from the command line did not work on Cygwin. I then tried it in a Windows console, the Windows equivalent of a Unix shell, where you can enter command-line statements. It worked!
If Python is the first language that you learn, you won’t experience the problems below. But if you come from Perl or other programming languages, be prepared for some surprises. While perplexing, they are easy to overcome.
First, there is no explicit “end” when you write a loop. The end is dictated by the indentation. It makes for shorter code that is easier to read, but you need to be very strict with indentation. I discovered this feature on my own. Second, using commas to separate variables does not work the way I expected. I had a hard time figuring out why, until I realized there is something called a “Tuple” in Python. Adding commas creates and defines a Tuple, not a list of variables. Once you are aware of it, it is not an issue.
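A minimal sketch of both surprises. The variable names (`total`, `pair`, `x`, `y`) are made up for illustration and do not come from the original script:

```python
# The loop body is whatever is indented under the "for" line.
# The first line back at the left margin ends the loop: no "end" keyword.
total = 0
for k in range(5):
    total += k      # inside the loop
# Dedented again: the loop is over.

# A comma builds a tuple, so this is ONE assignment of a tuple to "pair",
# not two separate assignments.
pair = 1, 2

# Tuple unpacking is the way to assign several variables on one line.
x, y = 1, 2
```

Here `pair` ends up as the tuple `(1, 2)`, while `x` and `y` each receive a single integer.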
Also, you cannot put all your functions (subroutines in Perl) at the bottom of the code. A function must be defined before the first call in your code. Debugging is very easy though: the error messages are more helpful than the ones I get from Perl. Using libraries is also surprisingly easy and intuitive.
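The ordering rule can be sketched as follows. All function names here (`late`, `main`, `helper`) are hypothetical. The second half shows one common workaround, offered as a suggestion rather than as what the original script does: wrap the top-level code in a function and call it at the very end, so helper functions can live below it.

```python
# Calling a function before its "def" line has executed raises NameError.
try:
    late(1)
except NameError as err:
    message = str(err)

def late(n):
    return n + 1

# Workaround: names inside a function body are only looked up when the
# function runs, so main() may refer to a helper defined further down.
def main():
    return helper(3)

def helper(n):
    return n * n

# By the time this call executes, both functions exist.
result = main()
```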
Perhaps the most surprising feature is that assigning a value to an element of a void (empty) array by its index won’t work. You have to append the element instead. It is a reminder that Python must do some memory allocation in the background, kind of like coding in C. In Perl this is transparent. I wish the interpreter would automatically recognize the first form and turn it into the second. What I did not realize initially is that arrays don’t really exist in Python: they are treated as lists.
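The exact statements are not shown in the text, so the sketch below assumes the failing one was an index assignment on an empty list and the working one a call to `append`. The names `values` and `squares` are illustrative:

```python
values = []
failed = False
try:
    values[0] = 3.14        # no slot 0 exists yet in an empty list
except IndexError:
    failed = True

values.append(3.14)         # grows the list by one element
values[0] = 2.71            # fine now: slot 0 exists

# Pre-sizing the list is another option when the length is known,
# and it allows index assignment inside a loop.
squares = [0] * 4
for k in range(4):
    squares[k] = k * k
```

Pre-sizing with `[0] * n` is the closest analogue to allocating an array in C, while `append` is the more usual choice when the final length is unknown.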
Now a few positive surprises. I was expecting Python to be very strict about variable types. I thought you would have to perform type casting all the time. Of course it is stricter than Perl, but less strict than C. All in all, it was not an issue. And explicit type casting makes for more robust code.
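A small sketch of where Python sits between Perl and C on types, with made-up variable names. Variables need no type declaration, but a string is never silently converted to a number:

```python
n = "3"                     # a string, not a number
try:
    total = n + 2           # Perl would quietly compute 5 here
except TypeError:
    total = int(n) + 2      # the conversion must be explicit

# Going the other way also takes an explicit cast.
label = "n = " + str(total)

# Numeric types mix freely: dividing two integers gives a float,
# unlike integer division in C.
ratio = 7 / 2
whole = 7 // 2              # floor division keeps an integer
```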
The scope of a variable is more flexible than I thought it would be. You don’t need to pre-declare all the variables, and certainly not at the top of your code. A local variable (within a function) with the same name as a global variable won’t overwrite the global variable. In some sense, it is a bit like Perl if you use its variable-scoping directives, which is good programming practice.
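The scoping behavior can be sketched like this, with hypothetical names (`count`, `local_shadow`, `change_global`). The `global` statement is shown only to illustrate that rebinding a global from inside a function requires opting in:

```python
count = 10                  # global variable

def local_shadow():
    count = 99              # creates a new local; the global is untouched
    return count

def change_global():
    global count            # explicitly opt in to rebinding the global
    count = 20

inside = local_shadow()
after_local = count         # still the original global value
change_global()
after_global = count        # now changed by change_global()
```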
I used three “array” initializations in my code, on three separate lines. I was told I could combine them into a single statement instead, but it does not produce the same results. I still have to figure out why, but it does not bother me. Finally, I thought I would have a hard time writing a function that returns multiple values. It actually worked on my first attempt, without problems. The function just returns a “Tuple”, though at that time I did not know there was something called a Tuple.
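One likely explanation, assuming the suggested one-liner was a chained assignment (the exact statement is not shown in the text): a chained assignment binds all the names to the same list object, so a change made through one name shows up under the others. The names below are illustrative, and `min_max` is a hypothetical function added to show a multi-value return:

```python
# Chained assignment: ONE list with three names.
a = b = c = []
a.append(1)                 # b and c see the change too

# Three separate statements: three independent lists.
x = []
y = []
z = []
x.append(1)                 # y and z stay empty

# A one-line form that does create three distinct lists.
p, q, r = [], [], []

# Returning several values is just returning a tuple,
# which the caller can unpack.
def min_max(data):
    return min(data), max(data)

lo, hi = min_max([3, 1, 4])
```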
A reader pointed to re.sub(…) for processing regular expressions. Another one suggested conforming to the Python PEP 8 style guide. More specific comments:
Should you hire a programmer with no Python experience for a Python job? This is a question worth pondering if you can’t find candidates. In 18 hours, I made a lot of progress. My next step is playing with hash tables, called dictionaries in Python. I used them a lot for NLP, in Perl. After that, I will try the AV video library and some graphics libraries like GD. I used them in Perl and R: they are written in C++ and, I have no doubt, available in Python as well.
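For readers coming from Perl hashes, a minimal sketch of a dictionary used the way it often is in NLP, counting words. The sample text and variable names are invented for illustration:

```python
text = "the cat and the hat"

counts = {}                 # empty dictionary, like an empty Perl hash
for word in text.split():
    # get() returns 0 for a word not seen yet, so no key check is needed.
    counts[word] = counts.get(word, 0) + 1

# Iterating over a dictionary yields its keys.
frequent = [w for w in counts if counts[w] > 1]
```

Unlike a list, a dictionary accepts assignment to a key that does not exist yet, so no `append` step is required.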
My Python script fuzzy.py and the input/output data sets are available on my GitHub repository. I wrote a technical document explaining the method and the code. It is entitled “Fuzzy Regression: A Generic, Model-free, Math-free Machine Learning Technique”, and is available from here. Finally, I’ve found the book “Think Python” by Allen Downey to be a good introductory tutorial for me, as it is rather compact.
Vincent Granville is a machine learning scientist, author, and publisher. He was the co-founder of Data Science Central (acquired by TechTarget) and most recently, the founder of MLtechniques.com.