Python Vs. R: The Battle Influenced By Machine Learning

A couple of years ago, I decided to proactively manage my health by having the most comprehensive exam of my life, performed by a company called Health Nucleus, based in San Diego. Health Nucleus uses genome sequence analysis, machine learning and $500 million worth of imaging equipment in an effort to reveal not only the complete picture of your health today but also your long-term risk for disease.

I was in the facility for about six hours, during which time it felt like a million images, movies and echocardiograms were taken. Amazingly, by the end of the day I had received an in-depth analysis beyond anything I had imagined. Thanks to its reliance on machine learning, the exam at Health Nucleus wasn’t just an interesting experience. It was life-changing, giving me a roadmap to living my healthiest life and showing me the specific things I would need to watch out for to achieve that goal.

Health Nucleus stands to revolutionize the practice of medicine with this kind of technology, but the pricing is too high right now for many people. The bigger question is: How fast will it develop?

We can’t answer this with absolute certainty at the moment. After all, it took electricity about 30 years to propagate throughout society. And machine learning projects aren’t immune to failure; a large percentage of them fail. Why? Partly because people try to use new technology the same way they’ve used old technology, and partly because they don’t have nearly enough of the data that machine learning requires.

And in machine learning, you need a lot of data.

Consequently, a tug-of-war between old technology and machine learning is always playing out before us. To illustrate this dynamic, a couple of programming languages on opposite sides come to mind.

One of the people I follow on LinkedIn, a professor who teaches at Oxford University, recently wrote about the programming language Python killing R, an open-source language for statistical computing. R is a very powerful tool for statisticians, who have loved it and created all kinds of libraries for it.

Typically, the statistical models built in R have very few parameters, and the amount of data they need to consume is relatively limited.

When machine learning arrived, it required systems with billions of parameters. The problem? Statisticians don’t use billions of parameters. They typically use closer to four. If you’re working as a statistician with a limited data set and you add many more parameters than that, you’ll probably run into what’s referred to as overfitting.

To elaborate on overfitting, let’s say a finance professional is looking to forecast the financial future. He studies specific theories and arrives at four parameters, fitted to history, to describe the present-day situation. His model may reproduce the past data it was trained on very well. But if it has been tuned too closely to a small slice of history, it has captured the noise in that data along with the underlying pattern, and it is not in a good position to forecast what comes next. That’s overfitting: the statistical model is so closely aligned to a minimal set of prior data points that it cannot generalize to anything beyond that data set.
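
Here is a minimal sketch of that idea in Python, using only the standard library. The numbers are made up for illustration: six past observations that roughly follow a straight-line trend, plus one "future" point. The function names (fit_line, fit_interpolant, mean_abs_error) are my own, and the two models are just one simple way to contrast a model with few parameters against one that has as many parameters as data points.

```python
# Illustrative, made-up data: a roughly linear trend (y = 2x + 1) plus small noise.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1.3, 2.6, 5.4, 6.7, 9.5, 10.6]
future_x, future_y = 6, 13.0  # the point the underlying trend would produce next


def fit_line(xs, ys):
    """Least-squares straight line: 2 parameters (slope and intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolant(xs, ys):
    """Lagrange polynomial through every point: as many parameters as data points."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mean_abs_error(model, xs, ys):
    """Average absolute gap between the model's predictions and the data."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


line = fit_line(train_x, train_y)
wiggly = fit_interpolant(train_x, train_y)

for name, model in [("2-parameter line", line), ("6-parameter polynomial", wiggly)]:
    train_err = mean_abs_error(model, train_x, train_y)
    future_err = abs(model(future_x) - future_y)
    print(f"{name}: training error {train_err:.2f}, error on new point {future_err:.2f}")
```

The polynomial matches every past observation, so its training error is essentially zero, yet its forecast for the new point is far worse than the simple line’s. It has memorized the noise rather than the trend.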

How does that change in a machine learning environment? Given enough training data, machine learning builds a far more flexible mathematical model, one that is shaped by the data fed in during training.

A great way to explain this is to think about your brain, in which certain cells want to recognize an image, let’s say, of one of your children. These cells are dedicated to making that connection so you can recognize your son or daughter. Then another child enters the picture and some other cells recognize them. And once this connection is made, it stays in place. Your cells have “learned” to identify specific people, and that is going to remain. In the most practical terms, when you have 4 billion parameters instead of four, some things will be learned by one neuron or one node in the network. The futurist and inventor Ray Kurzweil says a sort of self-organization occurs where certain parts of the brain recognize different patterns. Neural networks process information in a loosely similar way, with particular nodes triggered by particular patterns.

Statisticians Vs. Neural Net Proponents

As we’ve mentioned, statisticians are fond of their tools and the parameters that come with them, as limited as they may be. But when a model is required to have 100 billion parameters, how will that work using traditional statistical models? It won’t. And for building systems at that scale, the language of R isn’t nearly as user-friendly as Python.
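
To get a feel for the difference in scale, here is a rough back-of-the-envelope sketch in Python. The helper names and the layer sizes are hypothetical, chosen only for illustration: a linear model with three predictors on one side, and a plain fully connected network that takes a 224 x 224 color image as input on the other. Real image models are usually built differently, so treat this as arithmetic, not architecture advice.

```python
def linear_model_params(n_inputs):
    """A linear model has one weight per input plus one intercept."""
    return n_inputs + 1


def dense_network_params(layer_sizes):
    """Each fully connected layer has (inputs x outputs) weights plus one bias per output."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Three predictors plus an intercept: four parameters.
stat_model = linear_model_params(3)

# Hypothetical network: image pixels in, two hidden layers, 1,000 output classes.
image_pixels = 224 * 224 * 3
network = dense_network_params([image_pixels, 4096, 4096, 1000])

print(f"linear model: {stat_model} parameters")
print(f"dense network: {network:,} parameters")
```

Even this modest, made-up network lands in the hundreds of millions of parameters, which is why tooling designed around a handful of parameters is a poor fit for the job.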

That’s why proponents of machine learning decided to build their own systems using Python and its libraries (of which there are hundreds of thousands). They see themselves as delivering a “new form” of statistical analysis that isn’t limited to small historical data sets. Instead, a model views an image and assigns a probability that the image is what the model believes it is, whether 100% or 75%.

By this definition, machine learning is statistical. And by now, there could be 1 billion statistical machines, each with a different level of “confidence” in its own predictions.
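
For readers who want to see what assigning a probability looks like, here is a small Python sketch. It assumes a trained image model has already produced a raw score for each possible label; the labels and scores below are invented for illustration. The softmax function shown is one common way to turn such scores into probabilities, though it is not the only one.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores a trained model might output for a single photo.
labels = ["cat", "dog", "rabbit"]
scores = [2.0, 0.5, -1.0]

probabilities = softmax(scores)
best_label, best_prob = max(zip(labels, probabilities), key=lambda pair: pair[1])

for label, p in zip(labels, probabilities):
    print(f"{label}: {p:.0%}")
print(f"prediction: {best_label} ({best_prob:.0%} confidence)")
```

The model does not say “this is a cat.” It says “cat” is the most probable label and reports how confident it is, which is exactly the statistical flavor described above.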

Hiring a highly advanced candidate in the technical realm can feel a lot like chasing a moving target, especially when it comes to programming languages. But at Roy Talman & Associates, we pride ourselves on understanding what a candidate knows from a technical standpoint and how well they’ll respond to various challenges in a real-life situation. That’s why we initiate challenges and tests to ensure that you’re choosing from a pool of superior, passionate candidates who are great on paper and even better in practice after they join your culture. This isn’t what you get from just any recruiter. But it’s always part of the offering when you Talk To Talman First.