“Unexpected” challenges applying machine learning in fraud detection

A story came out recently about super-sophisticated self-driving cars being easily duped by relatively simple tricks used by some hackers. On the surface this seems shocking, as the brightest minds of top engineering companies have been working hard to make the promise of self-driving cars a reality – and to trigger a true revolution in our day-to-day lives. In fact, it is hardly surprising. The algorithms, and the signals they rely on, were probably never trained to resist active sabotage. They were merely trying to replace human beings in routine activities, just as they do in other areas such as language translation or image recognition. In ‘non-adversarial’ circumstances, the performance of an algorithm can be steadily improved over time. Once you have reached a certain reasonable threshold (e.g. detecting objects in pictures), you are not going to slip back into not recognizing them, even if you stop adding more features or bigger training data sets.

With fraud you are dealing with a different animal – the patterns you are trying to detect are actively trying to hide from you. Your successful detection yesterday doesn’t guarantee the same performance tomorrow. As the famous security expert Bruce Schneier once noted, “Attacks never get worse, they only ever get better.” And they do evolve, change, adapt and advance in quite unexpected ways.

Does this mean machine learning is ultimately powerless against the human creativity directed against it? Of course not. It is being successfully used to detect online fraud at top-tier financial and business institutions, some with spectacular results. Not to mention select human-vs-machine clashes, such as chess or Jeopardy!, where ML algorithms proved able to beat the best human experts. However, to achieve consistent results in practice, one should keep the following in mind:

  • Continuous learning that relies on fresh data is imperative. You are essentially teaching the algorithm to detect a constantly moving pattern, and the models will quickly degrade over time if they stay intact (see the sketch after this list).
  • Consistent investment in ever-more sophisticated features is also non-negotiable. Throwing more of the same data (going further back into history) is not going to help much, and squeezing more juice from the same data has its natural limits, too. The world constantly evolves and so should your features (in the self-driving car hacking example, the ‘feature’ itself was actually compromised).
  • Typically, no single solution will suffice to cover the entire (again, constantly evolving) fraud landscape – thus proper investment in the “plumbing” is necessary to enable complex execution plans such as multi-tier decisioning, running models in parallel, and applying different modeling techniques.
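
To make the first point a bit more concrete, here is a minimal sketch (with purely hypothetical feature names, window length and model choice – an illustration of the idea, not any particular production pipeline) of retraining a fraud model on a sliding window of fresh labeled data:

```python
# Minimal sketch: periodically refit the model on the freshest labeled events only,
# so it keeps tracking a moving fraud pattern instead of yesterday's one.
# Feature names, window size and model choice are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

FEATURES = ["amount", "account_age_days", "txn_count_24h"]  # hypothetical variables
WINDOW = pd.Timedelta(days=90)                              # hypothetical freshness window

def retrain(labeled_events: pd.DataFrame, now: pd.Timestamp) -> GradientBoostingClassifier:
    """Fit a new model using only recently labeled events."""
    fresh = labeled_events[labeled_events["event_time"] >= now - WINDOW]
    model = GradientBoostingClassifier()
    model.fit(fresh[FEATURES], fresh["is_fraud"])
    return model

# Run this on a schedule (say, weekly). If the current production model starts losing
# to the freshly retrained one on a recent validation slice, promote the new model.
```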

Back to the self-driving cars. Making them robust in the face of attempted sabotage will prove to be a much more costly and complex exercise, but it is nevertheless needed to make them compete with human-driven quality (even while the latter itself is getting more vulnerable). Recognizing how ‘classic’ machine learning practices differ from those aimed at fighting active fraud or sabotage is going to help along the long road ahead…

Lessons learned while building a state-of-the-art Analytics Platform

In the last several years, I was blessed with a chance to be an intimate part of an ambitious program which could be summarized as the design, development and operationalization of a state-of-the-art Data and Analytics Platform at my former employer PayPal.

It’s a well-known fact that the key to PayPal’s success has historically been its effective risk management function, which in turn relies heavily on predictive models. Back in the day, it used to take up to nine months to build and release such a model. Moreover, each ‘variable’ (a feature used in the model) was in itself a separate query hitting the same databases which served the core business. Such queries proved to be extremely expensive, since those databases were not designed to fetch the type of data Risk management needed (transactional vs. account-centric – or, for that matter, any other entity-centric – queries). As a result, the introduction of new variables was heavily restricted, severely limiting the capabilities of the risk models. Needless to say, especially in the case of fast-moving fraud patterns, a model built on 12- to 18-month-old data is largely obsolete by the time it’s rolled out to production. At some point it was pretty clear that tweaks and tricks had exhausted themselves and that we could not move forward without a fundamentally new approach. In a nutshell, PayPal needed to dramatically cut the lifecycle of predictive models and to radically lower the cost of leveraging more (and more sophisticated) data, to keep up with both fraud trends and the endless slew of new products PayPal has been releasing. And since the efficiency of risk models directly hits the bottom line of the company, the decision to build a brand new Analytics Platform had a very clear ROI appeal.

Fast forward to today, and the new Analytics Platform is an invaluable asset to PayPal’s Risk Management function, enabling dozens of predictive models running in hundreds of milliseconds and evaluating tens of millions of events a day with best-in-industry efficiency rates. The lifecycle of new models is cut to a couple of weeks, and they have since evolved from basic linear regression to significantly more complex types such as neural networks relying on many hundreds of variables.

So… what key lessons did I learn along the way? Without disclosing any significant details, let me lay out some select ones in hopes that you’ll find them helpful, if not illuminating.

It. Just. Takes. Time.

Building any major infrastructure project takes time and resources. If you think you can have it up and running in 6-12 months, you’re kidding yourself (and your stakeholders). Of course, a lot depends on the starting point (how bad your legacy stack is), the capabilities of the team (we all know the difference), and the particular expectations and requirements (where the devil typically resides). Still, if we are talking about building and deploying new infrastructure in a company with the size and scalability requirements of PayPal – with all the moving parts, organizational complexity and so on – we had better set ourselves up for a multi-year investment and prepare upper management for ‘strategic patience’.

Simulation, simulation, simulation

In predictive analytics, your efficiency is only as good as your ability to simulate what you are going to end up evaluating. No matter what spectacular results you got in your ‘data science environment’, they aren’t worth much if there is a significant gap between that environment and production (also known as ‘reality’).

I was amazed how much the aspect of data simulation is overlooked and demoted to the role of a ‘nice to have’ feature by some ‘experts’. In fact, reliable and efficient data simulation is absolutely at the heart of any platform which supports predictive analytics. It means the ability to simulate any feature which could potentially be used in the output model at the point in time of the actual event (e.g. a transaction) for which we are trying to predict the outcome (e.g. the probability of turning fraudulent). By ‘reliable’, I mean that the simulated value is the same as it would have appeared in production during the actual evaluation. By ‘efficient’, I mean the ability to support simulation of a large number (at least thousands) of features for a large number (tens of millions) of historic events reasonably fast (hours, not days).
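
To illustrate what ‘reliable’ means here, a minimal sketch (assuming a simple pandas event log with hypothetical column names) of computing one feature strictly point-in-time – only from data that existed before each event:

```python
# Minimal point-in-time sketch: for every historic event, compute the account's
# transaction count over the preceding 24 hours using only strictly earlier events,
# so the simulated value matches what production would have seen at evaluation time.
# Column names are illustrative assumptions; this naive loop is O(n^2) and exists
# only to show the semantics -- 'efficient' at tens of millions of events it is not.
import pandas as pd

def txn_count_last_24h(events: pd.DataFrame) -> pd.Series:
    """events: columns ['account_id', 'event_time'] (event_time as pd.Timestamp)."""
    counts = []
    for _, row in events.iterrows():
        window_start = row["event_time"] - pd.Timedelta(hours=24)
        prior = events[
            (events["account_id"] == row["account_id"])
            & (events["event_time"] >= window_start)
            & (events["event_time"] < row["event_time"])  # strictly before the event
        ]
        counts.append(len(prior))
    return pd.Series(counts, index=events.index, name="txn_count_24h")
```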

Strong Engineering and Data Science teams. Even stronger Product Management function.

Even best-in-class Engineering and Data Science teams (such as PayPal’s) have somewhat opposing perspectives and mindsets. Engineers tend to be conservative and hate working on anything without clearly defined boundaries; their KPIs typically revolve around availability and performance. For analysts, it’s never as black and white: the more data an analyst is given, the better they can do their job of predicting the target behavior, and their KPIs are all about the quality of that prediction. If engineers were the ones to decide, the platform would have strict limitations on what data could be used in the models, and all changes would go through rigorous control and oversight. If it were up to the modelers, they’d take all the data in the world (in real time, please!) and would want the freedom to adjust and change which data to rely on as late in the game as possible. These differences need to be recognized as natural and perfectly legitimate.

This is where a strong Product Management function with a good grip on both perspectives plays a crucial role. PMs are the ones to define the right framework within which we can both assess the business impact of new requirements and evaluate their potential impact on the system.

I personally liked to apply what I called the ‘ROI model’. In that approach, the engineers work towards minimizing the ‘I’ part – i.e. strategically reducing the cost of creating and maintaining new variables. The ‘R’ part is the responsibility of the Data Scientists – it reflects how much incremental value we get from an individual variable over time. Naturally, it’s easier said than done – it’s extremely tricky to evaluate the ‘R’ part as trends move, for example – but with a transparent, well-defined methodology it’s absolutely possible.
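
As a toy illustration only (all the numbers and variable names below are invented), the ‘ROI model’ boils down to ranking candidate variables by estimated return per unit of investment:

```python
# Minimal sketch of the 'ROI model': rank candidate variables by incremental lift ('R')
# per unit of build-and-maintain cost ('I'). Estimating both quantities well -- and
# re-estimating 'R' as fraud trends move -- is the hard, methodology-heavy part.
candidates = [
    # (variable name, incremental lift, build-and-maintain cost) -- invented values
    ("txn_count_24h",       0.8, 2.0),
    ("ip_country_mismatch", 0.5, 1.0),
    ("graph_linkage_score", 1.2, 6.0),
]

for name, lift, cost in sorted(candidates, key=lambda c: c[1] / c[2], reverse=True):
    print(f"{name}: ROI ~ {lift / cost:.2f}")
```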

No one single class of variables ‘wins the war’

Models are built on top of variables, and variables are built on top of data. A well-designed Analytics Platform needs to provide maximum flexibility both for introducing new data into the system and for providing tools to process that data (to produce variables). Moreover, the data can be processed in a variety of ways – real-time, near-real-time and offline. The Platform should give its users (analysts in this case) the ability to pick and choose which tools to use in each individual case – together with well-articulated costs and implications for each option. There is no single way which satisfies all the use cases. For example, some variables analyze trends and interconnections and need to crunch a lot of historic data, while others are all about the activity in the last several hours – or minutes – preceding the event being evaluated. A good Platform needs to support calculation and simulation for many classes of variables – even those we are not aware of yet – with well-defined processes, SLAs and costs for each such class.
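
Here is a minimal sketch of what ‘well-articulated costs and implications for each option’ might look like if spelled out in code – the class names, SLAs and costs below are invented for illustration, not a description of any real platform:

```python
# Minimal sketch: expose each class of variables together with an explicit freshness
# guarantee, evaluation-time budget and rough cost, so an analyst can pick the cheapest
# class that still satisfies what a given variable needs. All values are made up.
from dataclasses import dataclass

@dataclass
class VariableClass:
    name: str
    data_freshness: str     # how stale the underlying data is allowed to be
    eval_latency_ms: int    # scoring-time budget for one variable of this class
    relative_cost: int      # rough cost of adding/maintaining one more such variable

CLASSES = [
    VariableClass("real_time",      "seconds",       5, 10),
    VariableClass("near_real_time", "minutes",      10,  5),
    VariableClass("offline_batch",  "hours or more", 20,  1),
]

def pick_class(needs_last_minutes_activity: bool) -> VariableClass:
    """Crude rule of thumb: 'activity in the last minutes' variables need the real-time
    class; long-horizon trend variables can live in the cheapest (batch) class."""
    return CLASSES[0] if needs_last_minutes_activity else CLASSES[-1]
```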

I often used metaphors borrowed from the military (it started a long time ago, when I was bringing up to speed a new hire who happened to have an extensive military background). In the case of fraud detection the parallels are not hard to find. Indeed, we are waging a war on fraud, fighting a highly sophisticated and mobile adversary. No war is won with a single type of weaponry, hence we need to invest in both strategic (slower, but more powerful) and tactical (more versatile and targeted) types of ‘weapons’. A good Analytics Platform is our defense industry, and the more flexible it is, the easier it is for us to adapt to the ever-evolving fraud ‘attacks’.

 

All in all, building a scalable Analytics Platform that fully enables its users, is flexible enough to adapt and ingest ever more data, and satisfies stringent SLAs and availability requirements is a subject one could write volumes about. But the lessons above are the ones which deserve to be mentioned in a short blog post – since they are both important and often overlooked. IMHO.

Why face recognition as a way to replace passwords will remain a fantasy

Replacing much-hated (yet resilient) passwords with face recognition-based authentication has long been a cool idea of ‘how things will work tomorrow’ – yet that ‘tomorrow’, in terms of mass adoption, never really happened. Some may argue that the stars were not really aligned until now, but may be aligned very soon. Indeed, facial recognition methodology (naturally) keeps getting better. User-facing cameras (which just several years ago were limited to PCs equipped with an extra webcam) are becoming increasingly omnipresent – from laptops to tablets to smartphones. And the pain of remembering passwords keeps getting worse. The idea is pursued by a variety of smaller companies like KeyLemon or Sensible Vision, and face recognition features even made it into the Android mobile OS. Moreover, as recently as last month, none other than the formidable Jack Ma demonstrated how Alipay may allow payment authorization exclusively via face recognition.

So… is the tomorrow of “authorize with a ‘faceprint’” finally happening? I venture a prediction that it will never graduate from a cool concept to a widely accepted practice. I can mention at least two reasons why:

  • As with any other authentication mechanism, it’s going to be a cat-and-mouse game – the authentication technology will get better, only to be defeated by ever-creative fraudsters. In cases where the attackers are inherently capable of moving faster than the defense, the ‘cat’ is kind of doomed. We could reach a point – just like it happened with captcha – when building more defenses becomes unfeasible. How does this apply to the face recognition domain? The weakness of using face recognition for authentication purposes is nothing new – e.g. these guys nailed it back in 2009. True, the recognition software has improved a lot since then, and some interesting ideas like detecting eyeball movement or blinks may offer a chance, but then again, attacking these defenses to fool the software is getting cheaper at a faster pace (3D-printed masks, colored lenses, video-generated images?).
  • Any change in consumer behavior on a massive scale needs a push from a very large player interested in making money on it – such as Apple (case in point: mobile payments). Apple is hardly going to do it, though, as its newest devices already have fingerprint readers. While fingerprints arguably suffer from the same issues, they are a much more resilient biometric – fingerprints are way harder to obtain than pictures of potential victims (even taking this claim into account). Moreover, if we combine this observation with the dropping price of fingerprint readers, envisioning even cheaper devices having one in the near future is easier than imagining face recognition becoming the main biometric used to identify end users. In addition, cameras can be used to scan fingerprints instead of taking a picture of your face. There’s little evidence that other large companies would have enough incentive to go against this trend.

Having said that, I can see how a ‘faceprint’ could be used as one choice of biometric second factor, or in some physical stores which would like to appear futuristic to their customers. Maybe even some airports. Wide adoption, however, may remain ‘the cool feature of tomorrow’.

(not fraud related, but…) How to identify top performers?

It’s the end of the year, which is typically the time when many companies go through the torturous annual “performance evaluation” process for their employees – perhaps that’s why there seems to be renewed interest in the subject in the media and professional forums. Indeed, from NPR reviewing the much-hated ‘classic’ “rank and yank” approach to the rekindled debate around a resurfaced old presentation covering unorthodox HR methods at Netflix – it seems the methodology of how we assess employees’ performance is anything but a closed subject.

Of course, on the surface any approach would argue that it is about creating great teams by selecting the “top performers” – just like the ‘father’ of forced ranking, ex-GE CEO Jack Welch, explains in a recent WSJ article, or Patty McCord from Netflix, who talks about the importance of having “stunning colleagues” who inspire each other to deliver their best performance. This is all true, of course, but there’s one critical question to answer here – how to single out the top performers in an objective and consistent way across an entire organization, which is typically huge (at small companies everything is kind of obvious anyway). The widely adopted approach is tracking multiple types of behavior – such as the “seven aspects of Netflix culture” (the number varies by company). But that doesn’t address the core issue – how to distinguish the true performer from a careerist who is more preoccupied with making a good impression than with actually contributing to the bottom line?

The truth is – no matter how good a manager you are – at the end of the day the employees with their boots on the ground, working with each other on a daily basis, know best who is who. You just need to find a way of getting it out of them. As simple as it sounds 🙂

But how? I have been dealing with this issue for years now – having worked with dozens of peers, bosses and subordinates and gone through the whole spectrum of experiences, from horror stories to collaboration made in heaven. In retrospect, while most people have varying ‘hard’ and ‘soft’ skills which are more or less important in a particular context, at the end of the day there was one thing which really counted. That something could be best described as the answer to a single question: “would I hire him/her into my own startup?”

…which prompts another question – can we have a single measurement which would assess an employee’s value to the company? Could we for example do the following:

At the end of each cycle, we generate a list of, say, up to 20 people each individual worked with the most throughout that period. It is important to adopt a consistent approach to generating the list – and under no circumstances leave it to the employee to cherry-pick it. One way of doing it could be analyzing email traffic, meeting invites, etc. Alternatively, the manager could come up with the list for each engineer in his/her team (naturally, the list will have to include the manager and all the direct reports by default). The point is to come up with the full list of colleagues the given employee interacted with (or “got exposed to”) the most. This is a separate question, not really tied to the gist of the proposal, which comes next.

Each person on the list is then asked to anonymously answer a single question: “On a scale of 1 to 10, how likely are you to hire person X into your own company?” – with 1 meaning “under no circumstances” and 10 meaning “would absolutely hire”.

Naturally, the results will be skewed towards the positive (we usually do not like to throw our colleagues under the bus), but that can be taken into account – i.e. consider answers 1 through 6 as generally negative, 7-8 as neutral, and only 9-10 as positive.

If the above looks a lot like the “net promoter score” (NPS), the similarity is intentional. Indeed, NPS – after a lot of wrangling about how to measure the success of a product or a company – ended up as the single-point measurement increasingly embraced by various industries (the key – and only! – question there is “On a scale of zero to 10, how likely are you to recommend [a product or service of a given company] to a friend?”). Naturally, it’s not perfect – any attempt to jam a complex assessment of a business (or a person, for that matter) into a single digit is by definition just an approximation – but it comes pretty close to what companies want to track: market data shows a strong correlation between NPS and the financial success of a company – unlike any other single measurement out there.
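
A minimal sketch of how the anonymous answers could be rolled up NPS-style, using the buckets suggested above (the sample answers are invented):

```python
# Minimal sketch: score each employee's anonymous 1-10 answers the way NPS does,
# with 9-10 counted as positive, 7-8 as neutral and 1-6 as negative.
def hire_score(answers: list[int]) -> float:
    """Percent of 9-10 answers minus percent of 1-6 answers, ranging from -100 to +100."""
    promoters = sum(1 for a in answers if a >= 9)
    detractors = sum(1 for a in answers if a <= 6)
    return 100.0 * (promoters - detractors) / len(answers)

print(hire_score([9, 10, 9, 8, 7, 9, 10, 6]))  # invented data -> 50.0
```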

My hunch (I do not claim to have any real-life data) is that by applying this methodology you’d end up with a pretty good understanding of who the top performers in your organization are – they would consistently land in the 9s and 10s, while others would be more in the middle. If not the only metric, this could at the very least be used as a strong signal “from the bottom” which had better be listened to.

What’s the biggest threat to Bitcoin’s future?

Bitcoin is making headlines on almost a daily basis – startups ranging from currency exchanges to ‘virtual wallets’ keep multiplying. Developments such as the first Bitcoin ATMs and support from large-cap internet company CEOs all bode well for the future of the ‘internet independent currency’. What could threaten that future, one might ask?

It’s certainly not the technology. In fact, Bitcoin’s design cleverly protects it from brute-forcing by the ever-increasing processing power of the “miners”. It is probably not its vulnerability to manipulation and market price fluctuations either – these typically die away as a market stabilizes. It may not even be the regulatory instinct of governments.

Instead, the biggest threat may be hidden in some of its core features, such as anonymity and the lack of centralization – something that may attract disproportionately more shady transactions than legitimate ones. That may eventually lead to Bitcoin being increasingly viewed as a tool mostly serving criminals and terrorists, and eventually being forcefully outlawed.

True, any new technology is prone to abuse. Music and movie recording on magnetic tape gave a boost to piracy (which further exploded in the internet era). Ease of communication enabled spreading anything from homophobic propaganda to child pornography on a vast scale. Fast international funds transfers made money laundering a lot easier. Readily available encryption tools hugely complicated the lives of anti-terrorism agencies. And so on. But in all these cases the nefarious usage of the new technologies has been vastly outweighed by perfectly legitimate usage – as the benefits also attracted huge segments of the legitimate population. Hence “containing” these technologies has historically proven to be problematic and economically unfeasible (bar the “great firewalls” built by some autocratic states).

How is Bitcoin different? Well, it certainly does offer benefits to the average Joe – such as anonymity and an alternative, inflation-proof vehicle for investing for the future. But then again, how important is an anonymous transaction compared with other areas where anonymity matters – such as internet browsing (TOR or VPN)? And how much would you invest in Bitcoin as an alternative to a 401K? The fact that Bitcoin hasn’t been embraced by the millions – yet – may actually mean these benefits are not enough for common adoption.

By contrast, criminals are very quick to ‘jump’ on the new currency – one could argue it is a perfect solution for securing the most vulnerable component of fraud rings: monetization. The best example is the recent outbreak of the CryptoLocker virus, which encrypts all your files and subsequently deletes them unless you pay the ransom – in Bitcoins, of course. A quick search on the web confirms that so far this strategy is actually working – thousands of victims have ended up paying their way out. For most of them it is the first time they are exposed to Bitcoin – and let’s face it, it’s not the most pleasant introduction.

How Bitcoin would address this threat is unclear – breaking its architecture to provide some level of oversight may not be feasible. The key may be in attracting more legitimate usage of the technology – which is without doubt growing really fast right now – but the question remains: is the adoption of Bitcoin by legitimate users going to outpace its exploding popularity among criminals?

In case you needed more evidence…

…that we are either too simple-minded, ignorant or just plain lazy to care about our own security – here’s another example.

It’s a well-known phenomenon that people tend to choose simplicity over security when selecting their passcodes – from internet passwords to iPhone PINs.

I have some improvised “research” of my own. Here’s how it works: my local gym provides mini lockers for members to put their valuables in – car keys, wallets, cellphones, etc. The lockers are based on three-digit (rolling 0..9) codes. The member first dials a combination of three digits, then turns the door’s knob, and finally scrambles the combination so that the locker is locked. Unfortunately, way too many of them forget to scramble the code after they take their valuables out and leave. How do I know that? I am just guessing from quickly browsing the combinations of all the open locks. The recurring observation: in 3 out of 4 cases the combination is something extremely easy to remember – either A,A,A (where A is the same digit) or A,A+1,A+2. I’d bet their smartphone PINs are probably very similar, too (if only I could verify that hypothesis 😉 ).
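
For perspective, a quick back-of-the-envelope check of how small a slice of the 1,000-combination space those two patterns actually cover (the 3-out-of-4 figure above is my gym observation, not these numbers):

```python
# Count the 'easy' locker codes: all-same-digit (A,A,A) and ascending runs (A,A+1,A+2).
all_combos = 10 ** 3                                   # 000..999
same_digit = [(a, a, a) for a in range(10)]            # e.g. 7-7-7  -> 10 codes
ascending  = [(a, a + 1, a + 2) for a in range(8)]     # e.g. 3-4-5  -> 8 codes
weak = len(same_digit) + len(ascending)
print(f"{weak} weak codes out of {all_combos} ({100 * weak / all_combos:.1f}%)")
# -> 18 weak codes out of 1000 (1.8%)
```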

Now, the gym I am a lucky member of is frequented by upper-middle-class (it’s enough to look at its parking lot to estimate the average income of the fitness fans), young and middle-aged professionals who are supposed to be more intelligent and open-minded than the average Joe. Yet not only do they fail to come up with a code slightly more sophisticated than a 5-year-old would think of, they are also careless enough to leave it “open to the public” after they have used the locker.

Not that you’ll find it particularly shocking anyways…

Password Haystacks

In recent months the “dead horse” of password-based authentication got some new life in the form of so-called ‘password haystacks’. The approach, introduced by the well-known security expert (and one of my favorite gurus) Steve Gibson, relies on knowledge of the logic used by password brute-force attackers. In essence the attackers – after trying a list of well-known passwords (“password”, “123456”, “cat”, etc.), their variations (“pa$$w0rd”) and finally a plain dictionary – switch to ‘pure guessing’, where arbitrary combinations of alphanumeric characters and some special signs are generated and tried methodically until the password is guessed. Hence the “brute force” nature of the attack. So far the best prescription for passwords was to make them both random and very long – advice routinely ignored by the user community, as it made such passwords extremely hard for humans to remember. What Steve came up with is that passwords of similarly high “strength” (i.e. resistance to guessing) can be created by artificially increasing their length (each added character increases the time needed to crack them exponentially) and the space of characters used in them (the bigger the variety of lower-case, upper-case, numeric and special characters, the more combinations are possible – again drastically increasing the cracking time) by, say, prepending or appending some easy-to-remember “padding” to them. For example, ‘000Johny000’ is vastly harder to brute-force than ‘johny’ – yet the two require comparable effort for a human to remember. Makes perfect sense – you come up with your own secret “padding” pattern and use it to enhance your simple, but consequently easy-to-guess, passwords. Once enhanced, such passwords are both easy to remember and hard to crack (get a more detailed explanation from the source here). Sounds like a perfect solution, huh?
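
A minimal sketch of the underlying arithmetic, treating brute force naively as trying every string over a given alphabet (which is exactly the assumption the counter-move described below undermines):

```python
# Minimal sketch of the 'haystack' arithmetic: the naive brute-force search space is
# alphabet_size ** length, so both extra length and a richer character set blow it up.
def search_space(alphabet_size: int, length: int) -> int:
    """Number of candidate strings of exactly this length over this alphabet."""
    return alphabet_size ** length

print(search_space(26, 5))    # 'johny': 5 lowercase letters                 -> 11,881,376
print(search_space(62, 11))   # '000Johny000': 11 chars, lower+upper+digits  -> ~5.2e19
```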

Up to a point. While the “haystack” approach certainly adds to password-based security, it is hardly the end of the game. Like anything else in security, password attacks are a never-ending cat-and-mouse game between the ‘locks’ and the ‘keys’. Thus it’s only a matter of time until fraudsters update their password-guessing algorithms and tools to check ‘popular padding’ patterns first, before switching to ‘pure brute-forcing’. Not to mention the possibility of ‘leaking’ your password in some other way (e.g. through a phishing site), thus revealing the “secret sauce” of all your strong passwords – the padding pattern – to the attackers.

At the end of the day, as often mentioned in the past, passwords as a viable protection mechanism are pretty much dead (mostly). Indeed, other approaches like multi-factor authentication have no real alternative, no matter what clever ways we come up with to make our passwords less guessable.

Automated spear phishing – a perfect storm?

Back in January, one of my 2011 predictions for “cyber fraud story of the year” was more targeted yet massive phishing attacks. The two biggest news trends in cyber security seem to indicate that this threat may actually become real in 2011:

  1. Highly effective attacks targeting what one would expect to be the most impenetrable companies, whose bread and butter is cyber security – RSA and Oak Ridge National Lab. The term frequently used to describe these attacks is “Advanced Persistent Threat” – but in reality what hides behind it is a successful spear phishing attack.
  2. Repeated exposure of massive amounts of user personal data – names, emails, addresses, and in some cases even dates of birth, credit card numbers (!) and SSNs (!!). Just a couple of the breaches in recent months expose the scale of the problem.

Spear phishing attacks have always been considered a highly targeted version of a cyber attack, tailored to the potential victim’s profile (hence the name – phishing with a ‘spear’ rather than a ‘wide net’). The RSA and Oak Ridge National Lab breaches are yet another confirmation of the efficiency of such attacks. Typically, the targets of spear phishing attacks are senior executives (sometimes spear phishing is referred to as ‘whaling’ for that very reason) or companies which represent a hefty prize to the fraudster community.

Could usually hand-crafted spear phishing attacks be automated and put on a massive scale? I don’t see why they couldn’t (most probably, to some extent, they already are). As common knowledge in the industry goes, simply adding the victim’s name to the phishing email’s opening line drastically increases the probability of the end user trusting the message (and then clicking the link). Add to it knowledge of the companies the victim has an established relationship with, the phone number (by the way, has anybody thought of automated phone attacks?), the address – and the attack can be personalized to a degree where an ‘average Joe’ stands no chance of distinguishing it from email communication coming from the real business.

To be sure, exposure of user data is in itself a very dangerous phenomenon. In addition to “old-fashioned” identity theft, stolen user data can be applied in other types of attacks – such as password guessing (your name is John and you were born in 1970? The chances that you use one of ‘john1970’, ‘Johny70’, ‘JOHN70’, etc. are vastly higher than those of any particular piece of random gibberish). However, marrying phishing attacks with intimate knowledge of the victim’s data may prove to have the most severe and widespread impact.

What will happen when spear phishing goes massive? Hopefully, it will speed up the adoption of well-known counter-measures. For businesses – discipline in storing user data and adoption of 2FA. For end users – the practice of using different passwords across different sites (reusing one should feel as weird as using the same key to unlock your house, car and office), not clicking on links in emails (should be as weird as opening your door to a stranger) and keeping your personal data away from the rest of the world.

The best cyber security practices are…

…the ones which don’t expect any action or assume any expertise from the end user. Naturally.

I did try to make a case for ‘no substitute for user education’ several years ago. However, with the explosive penetration of the Internet as a service as ubiquitous and essential as the phone or even water and electricity, the prospect of having a security-savvy user base – capable of understanding the difference between HTTP and HTTPS, or paypal.com and paypal.abc.com – keeps getting further away. Indeed, the answer to the growing cyber fraud threat cannot rely solely on an assumption about the average netizen’s ability to detect and fight back ever more sophisticated attacks from the bad guys. Continuing the analogy with physical security, it’s equivalent to saying “let’s assume all good guys have a gun and know how and when to use it to defend themselves”. That strategy might have worked in the Wild West (if it did), but has poor chances in the 21st century’s Cyber World (sorry, NRA).

Not surprisingly, the industry is slowly but surely moving towards, let’s call it, “built-in security”. The shift in mindset could be characterized by security considerations becoming more of a driver and less of an afterthought.

For example, it’s well known that many users chronically fail to patch their computers – operating systems and applications (browsers, PDF readers, Java VM, etc.). That leaves them wide open to ‘exploits in the wild’ – inevitably resulting in data being stolen and machines being infected and ‘enlisted’ into a botnet. To address this situation, more companies are switching to a ‘stealth update’ mode. For instance, unlike its competitors, Google’s Chrome chooses not to ask the user to initiate an update – it does it silently, without users even knowing it. Windows 7 seems to adopt the same approach – by default, users are not asked to perform any action to have their operating system patched.

The same rule applies to other security measures. Facebook recently introduced a nice feature enabling its traffic to be switched to HTTPS. Alas, the option is off by default and the 600 million users are expected to go to their account settings and turn it on manually (most probably Facebook was afraid of the cost of a wholesale move to HTTPS). Again, Google shines here. Not only did it move its Gmail service to HTTPS well before Facebook did, it also made it universal and on by default – no user action was expected. I bet the vast majority of Gmail users didn’t even notice the change. Another, less-known example is the recently introduced Strict Transport Security mechanism, which allows web servers to refuse non-secure (or even suspicious) connections in order to prevent man-in-the-middle attacks. Again, “average” users need not even know the mechanism exists.
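
As a small illustration of the Strict Transport Security idea (a bare-bones Python sketch, not how any of the companies above actually deploy it), the whole mechanism is a single response header the browser remembers:

```python
# Minimal sketch: a server adds the HSTS header, instructing the browser to refuse
# plain-HTTP connections to this host for the next year -- no user action required.
# In reality the header only takes effect when served over HTTPS; this toy server
# just shows where the header lives.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Strict-Transport-Security", "max-age=31536000; includeSubDomains")
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"secure by default\n")

if __name__ == "__main__":
    HTTPServer(("localhost", 8443), Handler).serve_forever()
```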

These trends are bound to gain momentum. I imagine more and more companies will switch to HTTPS in the near future, and patching will not require user confirmation by default (perhaps leaving an “ask me first before updating” option – off by default – for tech-savvy, or picky, users). More services will move away from simple password-based authentication. Microsoft Security Essentials will become an integral part of the Windows OS (if anti-trust concerns allow it). Applications will become increasingly sandboxed. And so on…

This is not to say that one day you will be able to survive in the Cyber World without some basic knowledge and prudence – just as you need some common sense in everyday life, from how to cross the street to avoiding dangerous neighborhoods. However, that knowledge should be kept to a minimum, be intuitive, be transparent, and belong in the public domain and even in the school (kindergarten?) curriculum. In the end the rules should be simple enough that – unless you are striving for a Darwin Award – by following them you are not risking your (cyber) well-being. The rest should be taken care of by smart technology. Ideally.

Superiority of the “known good” over “known bad”

Okay, some definitions first:

  • The “known bad” strategy implies covert collection of the attributes used by fraudsters – first of all devices, but also email addresses, phones, etc. – in order to detect their repeat usage. It’s essentially a blacklisting technique, implying that if you are not blacklisted, you are good to go.
  • “Known good” is pretty much the opposite – it’s an overt policy of collecting the attributes – again devices first of all, but also email addresses, phones, etc. – to gain the necessary assurance that they are legitimately used by the good guys. It’s effectively whitelisting, implying that if you are not whitelisted, you are a potential suspect. Naturally, to make an attribute whitelisted (or to mark it as ‘trusted’), the user has to go through a certain verification process. For example, to whitelist a machine, the user has to enter a code sent via email or SMS (essentially following a 2FA approach; see the sketch after this list).
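
A minimal sketch of that verification step (names and in-memory storage are illustrative assumptions – a real implementation would add code expiry, rate limiting and persistent storage):

```python
# Minimal sketch of 'known good' device whitelisting: the user proves control of an
# out-of-band channel (email/SMS) by echoing back a one-time code; only then is the
# device marked as trusted. Everything not on the whitelist stays a potential suspect.
import hashlib
import secrets

trusted_devices: set[str] = set()      # whitelisted ('known good') device identifiers
pending_codes: dict[str, str] = {}     # device_id -> sha256 of the outstanding code

def start_whitelisting(device_id: str) -> str:
    code = f"{secrets.randbelow(1_000_000):06d}"   # 6-digit one-time code
    pending_codes[device_id] = hashlib.sha256(code.encode()).hexdigest()
    return code                        # in reality: delivered via email/SMS, never returned

def confirm_whitelisting(device_id: str, code: str) -> bool:
    expected = pending_codes.pop(device_id, None)
    if expected and hashlib.sha256(code.encode()).hexdigest() == expected:
        trusted_devices.add(device_id)
        return True
    return False

def is_known_good(device_id: str) -> bool:
    return device_id in trusted_devices
```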

Now, the traditional strategy adopted by the cyber security guys has always been the first one – just like in “offline” life, where we all enjoy the presumption of innocence (unless we slide into a totalitarian form of government) and where the “blacklists” are reserved for a few suspected criminals. It definitely is a more intuitive and, to a certain degree, effective way of raising the bar in online security. However, it becomes increasingly inefficient as fraudsters get more sophisticated in hiding their identity. Indeed, only lazy or grossly uneducated fraudsters fail to delete their cookies (historically, the number one way of identifying a device) today. Adobe’s FSO – which succeeded the cookie – is next to fall. Soon the larger fraudster community will discover the beauty of sandboxing. In essence, it’s just a matter of the appropriate tools being developed and made available on the “black market” – the average fraudster doesn’t even have to know all the gory details to use them. Thus, as I mentioned in my previous post, device fingerprinting is pretty much doomed.

By contrast, the “known good” strategy is increasingly gaining traction with online businesses. Initially unpopular, since it introduces another hoop for legitimate users to jump through (businesses hate that), it simply works much better by definition. Fraudsters now need to gain access to the victim’s email account or cellphone, or hack the computer, to get around it. (It should also be mentioned that, on a conceptual level, the superiority of whitelisting over blacklisting is apparent in many other cases – such as keeping user input under control.)

The switch to “known good” is not a painless exercise and, yes, it introduces an additional hurdle for the business, but it may prove to be the cheapest way of putting a dent in losses by making account takeovers much harder to hide. Both in terms of nuisance to the users and in cost, it fares much better than some of the extra measures I see on many websites – such as selecting an image, asking additional questions, etc. – thus my take is that the popularity of the “known good” approach will continue to rise.