Thursday, December 8, 2016

LinkedIn Influencers & Silent Evidence

This post will be a short one!

Many of us follow influencers on LinkedIn, such as the CEOs of big companies: Bill Gates, Jeff Bezos, Mark Zuckerberg. We follow their advice, read their "5 rules to change your life", and share their posts with our community. These people inspire us and make us want to follow their path, replicating their success pattern in our own careers. For example, I personally follow Laszlo Bock, Google's VP of Human Resources, as I am interested in his advice on resumes and job applications.

Nevertheless, following influencers to understand success is biased and can easily lead to wrong conclusions. This is survivorship bias, also called "silent evidence" by Nassim Nicholas Taleb in his book The Black Swan. He illustrates it with the following story:

Diagoras, a nonbeliever in the gods, was shown painted tablets bearing the portraits of some worshippers who prayed, then survived a subsequent shipwreck. The implication was that praying protects you from drowning. 
Diagoras asked, “Where are the pictures of those who prayed, then drowned?”

Do we know how many people applied the "5 rules to change your life" but failed and never became LinkedIn influencers? Until we do, we should stay skeptical and not take whatever a top influencer tells us as truth, but rather try to validate it.

Thursday, November 17, 2016

CDN benchmarking

Today, when you want to compare the performance of different CDN providers in a specific region, your first reflex is to check public Real User Monitoring (RUM) data, with Cedexis being one of the best-known RUM providers. This data is very useful, and some CDN providers buy it in order to benchmark themselves against competitors and work on closing performance gaps.

Below I will highlight what exactly RUM measures, so you do not jump too quickly to imprecise conclusions. Let's focus on the latency KPI and list the different components that contribute to it:
  • Last-mile network latency from the user to the CDN edge on the ISP network, which reflects how close the edge is to the user.
  • Caching latency, incurred when the CDN edge does not have the content and must go back to the origin to fetch it.
  • Connectivity latency from the CDN edge to the origin.

In general, RUM measurements are based on timing the round-trip delay (RTD) it takes to serve the user a predefined object from a CDN edge. Since the object is always the same and never changes, it is always cached on the edges, so the measurements reflect solely the last-mile network latency. But that is not the whole picture, because in real life CDN edges need to fill content from the origin:
  • Depending on the cache eviction policy and the disk space available, a request may be a cache miss. The less storage capacity an edge has, the higher the CDN's caching latency.
  • Depending on the CDN backbone, the more hops you need to cross to reach the origin, the higher the connectivity latency. On this aspect, for example, Tier 1 IP networks that provide CDN services are highly optimized.
For highly cacheable content, a comparison based only on last-mile latency makes sense, but it has limits when that is not the case, such as for long-tail video streaming or dynamic content.
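The gap between what RUM measures and what real users experience can be sketched with a simple expected-latency model (all numbers here are hypothetical, for illustration only):

```python
# Expected delivery latency seen by a real user, combining the components
# above. RUM with an always-cached test object measures only the last
# mile; real traffic also pays the origin round trip on cache misses.

def expected_latency_ms(last_mile_ms, origin_rtt_ms, hit_ratio):
    """Average latency: hits cost the last mile only; misses add the
    round trip from the edge back to the origin to fill the cache."""
    miss_ratio = 1.0 - hit_ratio
    return last_mile_ms + miss_ratio * origin_rtt_ms

# Popular, highly cacheable content: RUM is a good proxy.
popular = expected_latency_ms(last_mile_ms=20, origin_rtt_ms=120, hit_ratio=0.99)

# Long-tail content: frequent misses dominate, RUM underestimates badly.
long_tail = expected_latency_ms(last_mile_ms=20, origin_rtt_ms=120, hit_ratio=0.60)

print(f"popular: {popular:.1f} ms, long tail: {long_tail:.1f} ms")
```

With the same 20 ms last mile that RUM would report for both, the long-tail workload ends up more than three times slower on average.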

Monday, November 14, 2016

Sales Engineer, modus operandi

Recently, a friend asked me: "What qualities would you look for if you had to recruit someone for the same position as you, i.e. Sales Engineer?", and of course, to make it easier, he also asked how to evaluate these qualities. In this post I will try to answer this very interesting question, which pushed me to step back a bit and think about my role with some detachment.

The Sales Engineer (SE) role can differ quite a bit from one company to another and goes by different titles: Solutions Engineer, Presales Engineer, Solutions Architect, Consultant... In fact, an SE can be more or less specialized, more or less involved in delivery, pricing or bid management... In a nutshell, an SE, as part of the sales team, is a professional who provides technical advice and support in order to meet customer business needs. Nowadays, companies seem to have difficulty finding such a profile, one that combines business acumen with extensive technical knowledge.

The first word in SE is "Sales", but let me start with "Engineer", because technical knowledge is the solid foundation on which trust is built with customers. Indeed, an SE must be ready to dive into a technical subject as deeply as the business requires, understand customer problems and solve them. Nonetheless, static knowledge is not sufficient in a fast-paced, ever-changing technological landscape: this is where the SE's curiosity and passion for learning are vital for his "survival".
Take the case of an SE specialized in CDN. He knows very well how the internet works: the TCP/IP stack, the DNS system and the HTTP protocol. He can explain caching at a high level, but can also dig into HTTP RFC 2616 if needed to answer a specific question about caching. He follows the latest trends in his industry, such as HTTP/2, TLS, SDN and security, and looks closely at what the competition is doing.

So, back to the "Sales" in SE. First, the company counts on the SE's alignment with sales targets, as well as on his strategic thinking to create competitive advantage for its products. Second, the SE regularly gives presentations and demos to customers, so he needs good communication skills, must excel at storytelling, and must be attentive to his audience in order to adapt in real time. Finally, I would say that an SE must handle the stress and pressure that come with sales dynamics. A typical way to assess this skill set is to ask the SE to give a presentation and challenge him during it.

But the best of the SE role lies in the synergy between "Sales" and "Engineer". Able to deliver a technical pitch at variable depth, the SE builds relationships and trust at different levels within the customer's organization. For example, our CDN SE brings value to the customer's CTO by explaining how his product can increase revenue by improving the online buying experience, or cope with the Christmas load on the infrastructure, while at the same time advising the customer's website admin on caching best practices for an optimal CDN setup. By talking with the customer and asking the right questions, he is able to translate business requirements into a technical solution.

In addition, some other skills are very nice to have in this role, such as coaching partners, training salespeople, managing projects...

I have been lucky to hold an SE position for more than 5 years now, a position that sits in a special place within the organization, at the crossroads of sales, engineering, business development, product management... This role has changed me a lot, and I can already feel the opportunities it is opening up for me.

Wednesday, November 2, 2016

How to evaluate a DDoS mitigation solution?

Let me start with a funny story. Marie, a 16-year-old school student, was our guest in the office for a week to discover the professional world. We told her about our business, networks, the internet... but when we started talking about IT security threats, we were in for a hilarious surprise: she confessed to having already launched a DDoS attack on the school website, so that her parents could not access her results on the day grades were published online!

With almost no entry barriers to launching DDoS attacks nowadays, the industry is witnessing considerable growth in the number and size of attacks. Unprotected connected objects have driven this growth exponentially, with IoT-infected botnets being massively used as an attack vector. In the last 30 days, KrebsOnSecurity was hit by a 600 Gbps attack, and OVH by an attack of over 1 Tbps. The latest attack targeted the DNS provider Dyn, whose failure impacted major internet services such as Netflix. The Mirai malware was used to launch it, scanning and infecting more than 500,000 connected cameras, DVRs and other devices.

To protect themselves, companies are dedicating a larger share of their budgets to security, creating a very attractive emerging business for providers from different horizons: vendors providing services based on their own technologies (Arbor, Radware), network operators (Level 3, Tata), CDN providers (Limelight, Akamai), security providers (Incapsula, Cloudflare) and cloud providers (Azure, Rackspace).

Each positioning and implementation has its strengths and weaknesses. In the following I'll share some key technical elements to take into consideration when evaluating a DDoS mitigation solution.

Let me start with a quick description of DDoS attack layers. Attacks target resources at different layers, each of which is critical for service continuity. Volumetric attacks either try to flood internet bandwidth, mostly with reflection mechanisms (DNS, NTP, CHARGEN...), or to overwhelm frontal network equipment, for example by exhausting router CPU with packet fragmentation. In the upper layers, an attacker can target the middleware, such as the HTTP server, with brute-force GET requests and slow-session techniques, or the application layer directly with well-crafted HTTP requests that exhaust the application logic or its database.

Providers should be able to protect from DDoS attacks on different layers:
  • Protection from volumetric attacks requires a considerable infrastructure capacity in terms of network and scrubbing centers.
    Indeed, the number and geographical distribution of scrubbing centers are critical to absorption capacity and robustness, as they make it possible to mitigate an attack as close as possible to its source, before it forms an avalanche that is much riskier to handle downstream. Think of a provider lacking presence in some regions, such as APAC, or having only one scrubbing center in a given region. For the same reason, scrubbing centers should be connected to the internet through extensive peering and network capacity.
    On this ground, Tier 1 operators are best positioned to deal with the largest attacks (e.g. 1 Tbps) thanks to their scale. For example, Level 3 has implemented BGP Flowspec on its backbone, leveraging its edge capacity (42 Tbps) to block some volumetric attacks before they even reach the scrubbing centers.
  • Protection from upper-layer DDoS attacks is more about the intelligence inside the scrubbing centers. Some providers use proprietary technologies like Radware, Arbor or Fortinet; some mix them for better security; and some use none at all, avoiding licensing fees to be more competitive on price. What is the underlying technology capable of? Signature-based detection only, or behavioral analysis as well? Does it handle SSL traffic? What is its false-positive ratio? Is it compatible with hybrid (cloud + on-premise) deployments?
  • In all cases, mitigation should be powered by threat intelligence capabilities. For example, a botnet can be identified before any attack by its communication profile with C&C servers, and the associated infected IPs can then be fed to the mitigation technology.

One last thing I want to mention is performance. It is not enough for a provider to stop a DDoS attack; it should also guarantee that normal traffic will not suffer from performance issues. Let me illustrate with some examples:
  • A large percentage of your traffic is very local to a region (say, the Middle East) where the provider does not have a scrubbing center. Your traffic will travel to Europe to be scrubbed and then back to the Middle East, adding considerable latency.
  • Your provider has only one scrubbing center in Europe, and it gets critically loaded by attacks on several of its banking customers. In this case, your traffic will be rerouted to the next-nearest scrubbing center, for example in North America, again adding considerable latency.
  • Routed mitigation solutions use BGP to divert your traffic at /24-subnet granularity from your AS to your provider's AS, where it is cleaned and sent back to you. The first thing to consider is BGP convergence time, because it impacts the overall time to mitigate. Convergence time decreases when your provider is very well connected to the internet, and can even be near-instantaneous if you use the same provider for internet connectivity. The second thing to consider is the impact of rerouting your entire /24 subnet when only one host is targeted: does your provider give you the possibility to reroute only the attacked IP?
  • Your provider uses a scrubbing technology that requires intensive tweaking and human intervention for each attack mitigation. In this case, you can expect a longer time to mitigate.
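The latency penalty in the first two examples can be roughly estimated from propagation delay alone, since light in fiber travels about 200 km per millisecond. The distances below are illustrative, not measured routes:

```python
# Rough extra round-trip latency added by detouring traffic through a
# distant scrubbing center, counting only propagation delay in fiber
# (~200 km/ms); real paths add routing and queuing overhead on top.

FIBER_KM_PER_MS = 200.0  # light travels roughly 200 km per ms in fiber

def detour_rtt_ms(direct_km, via_scrubbing_km):
    """Extra RTT when traffic goes via the scrubbing center instead of
    the direct path (the detour is paid in both directions)."""
    extra_km = via_scrubbing_km - direct_km
    return 2 * extra_km / FIBER_KM_PER_MS

# Illustrative example: Middle East traffic scrubbed in Europe.
# Direct regional path ~870 km; via a European scrubbing center
# ~9,260 km in total (out and back to the region).
print(f"added RTT: {detour_rtt_ms(870, 9260):.0f} ms")
```

Tens of milliseconds of extra round-trip time is significant for interactive traffic, which is why scrubbing-center geography matters as much as raw capacity.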

Monday, October 10, 2016

Networks & economic paradigms

I will start with the above revisited version of Maslow's pyramid of human needs. It is a funny expression of how the internet is now a basic need, making all of us "data" consumers. Data delivery to consumers is organized in different ways, according to different economic models. In the following I will go through these models and their correlation with known paradigms for organizing an economy: liberalism, centrally planned economy and participatory economy.

The first paradigm is the internet's current decentralized organization, based on liberalism or free trade, the dominant ideology nowadays. A user is connected to the internet through an eyeball network, e.g. the local ISP or mobile operator. Eyeballs, in turn, are connected to the internet via different kinds of peering:
  • Direct peering with content providers such as Google, Amazon & Netflix,
  • Peering with other regional eyeballs to exchange traffic directly,
  • Peering with backbone providers such as Level 3 & Cogent, who connect eyeballs together globally.
The dynamics driving network meshing are very interesting. An eyeball has many questions to answer in order to guarantee good internet connectivity and a profitable business:
  • Which of the above peering types should we build, and in what proportions?
  • With which networks should we peer? With what capacity? Via private peering or through internet exchanges?
  • In which geographical locations should we peer with a given network? In which carrier hotels?
  • What is the cost of the transport network to those locations?
  • Should we pay for a peering, or is it free?
  • How should we diversify peerings to guarantee resiliency?
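Several of these questions boil down to cost arithmetic. A toy break-even model for "should we build this private peering?" might look as follows; all prices are hypothetical, as real deals depend on commits, locations and politics:

```python
# Toy break-even: private peering carries a fixed monthly cost (port,
# cross-connect, transport), while transit is billed per Mbps. Peering
# pays off once enough traffic to that network shifts off paid transit.

def breakeven_mbps(peering_fixed_monthly, transit_price_per_mbps):
    """Traffic volume above which private peering is cheaper than
    carrying the same traffic over paid transit."""
    return peering_fixed_monthly / transit_price_per_mbps

# Hypothetical figures: $1,500/month for a 10G port plus cross-connect,
# transit priced at $0.50 per Mbps per month.
print(f"peer if you exchange more than {breakeven_mbps(1500, 0.50):.0f} Mbps")
```

The same arithmetic, extended with transport costs to each carrier hotel, is what drives the peering-versus-transit breakdown in practice.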
Because internet networks are built and run by the private sector, the main driver is profit. Along the way, we have witnessed competition, price compression, innovation, the rise of broadband... But making profit means serving only solvent consumers, which has led to the unfair image of the world below in terms of network connection density.

The second paradigm is the organization of a national eyeball's access network, which is based on central planning. Indeed, network expansion and deployment are planned according to usage forecasts from marketing studies, and decided centrally by the eyeball in a top-down approach. The main driver of building such networks depends on whether it is a public or private service, and on the pressure of telecom regulation, leading to more or less fair network coverage.

The main advantage of this model is a more rational and efficient use of resources (network assets, people) to satisfy the present and future needs of the population. On the technical side, the network is more controlled and can thus potentially provide better service, for example:

  • Traffic types (voice, download...) are differentiated, and quality is managed end to end.
  • Traffic routing is better controlled, using any chosen protocol, whereas on the internet only BGP can be used, with its limitations.
  • Specific techniques can be used to optimize network usage, such as multicast for video streaming, which is almost impossible on the internet.
But on the other hand, planning cycles have significant inertia and often cannot cope with demand dynamics. Moreover, eyeballs are not leaders in innovation; for example, the ongoing SDN/NFV revolution in networking is driven by software companies like Google.

The third paradigm is the FON model, based on participatory economy, i.e. the network is crowdsourced by the users themselves.

As explained in the above video, a Fonero (a user participating in the FON network) shares his home WiFi connection with others in a secure way, and in return gets access to other Foneros' WiFi anywhere, anytime. By making use of the idle bandwidth on your internet box, you gain free access to thousands of hotspots around the world. It is the same concept as P2P file sharing on the internet.

The main driver of such communities is making the world a better place, in a bottom-up approach. Agility, innovation, open standards & free service are the keywords of this model.

As a final word, I personally believe in a fourth paradigm, a mix of the last two. I will try to develop it in a future post.