Massive Internet Attack Floods the World with Fake Data

Reddit is now at the center of an attack that has affected millions of top domains (most of the Internet) since November 30. While Reddit appears at first glance to be the perpetrator, it is actually the victim. This “behind-the-scenes” scheme, run from Russia, generates huge amounts of fake traffic

Analysis of 2 Million Hijacked Passwords (in Python)

Posted by Jianhua Li on GitHub. This was proposed as a data science project on Data Science Central, to challenge your data science skills on a real data set. Below is an overview. Basically, one should try to answer the following three questions: What are the most common patterns found in passwords? Based on

5 Business Models That Suit the Startups!

If there is an idea or a concept that the people around you are begging you to turn into a business, why not go for it? Not only will you be doing what you love to do, but you will also be bringing in the green. A business model is an

The Environmental Cost of Big Data: It’s Higher Than You Think

One of the major selling points of colocating a business’s servers in a data center is the reduced energy consumption. Small businesses have long been sold on the idea of reducing the in-house power resources necessary to operate the network, which in many cases had moved well beyond a simple

Bring Shadow BI in from the Cold

Contributed by Ilan Hertz, Head of Digital Marketing at Sisense. Chances are, you have shadow BI operatives in your organization. Yes, really, and probably many more than you think. The urge to stop shadow BI at all costs might be strong, but wait! Before you set your phasers

Machine learning as a service? Might lose sleep over this!

This post is ‘not’ intended to teach people how to use popular predictive modelling APIs for free. Although, to your surprise, this isn’t a far-fetched possibility. A trained machine learning model is basically a function that maps feature vectors to the output variable. Upon querying with a test

Why so many Machine Learning Implementations Fail?

A recent article in TechCrunch describes issues at Twitter and Facebook: algorithms unable to detect fake news or hate speech. I wrote about how machine learning could be improved, and what can make implementations underperform – or not perform at all. And a colleague shared with me an article about how

Four Great Pictures Illustrating Machine Learning Concepts

Four pictures were posted recently on Data Science Central and immediately became popular. They are designed as one-page tutorials on specific (basic or advanced) topics. Click on the links below to find those related to the subjects that you are interested in. Four Great Pictures Illustrating Machine Learning

Python, Machine Learning, and Language Wars. A Highly Subjective Point of View

Guest blog by Sebastian Raschka. Sebastian Raschka is the author of the bestselling book “Python Machine Learning.” As a Ph.D. candidate at Michigan State University, he is developing new computational methods in the field of computational biology. Sebastian has many years of experience with coding in Python and has given several seminars