MRTG 95th Percentile

This is Mirrored Information from SeanAdams.com Website

95th percentile billing is the standard way ISPs bill for bandwidth, and the standard way ISPs are themselves billed. Typically, samples are taken every 5 minutes for a month and the top 5% are discarded, which gets rid of the so-called "spikes". The client is then billed on the highest remaining sample, the 95th percentile of their bandwidth usage.

ISPs have to reserve capacity on their fiber for you, and they pay for it each month, just like you do. If you use a certain capacity even for one day, no one else can use it, so that capacity has to be reserved for you for the whole month. This is why ISPs don't usually like "spikey" traffic, and why clients with "spikey" traffic don't like paying for a whole month of capacity they only need one day a week. Unless an ISP has purchased more bandwidth than it is using, it will typically want to bill you the same way it is billed: by the 95th percentile, a fair tool for measuring capacity. Typically you will see MRTG, RTG, or Cacti graphs, but they are all basically the same: graphs of samples of your traffic.

MRTG is an SNMP monitoring/graphing program by Tobias Oetiker and Dave Rand.

 

MRTG Graph

What is the 95th percentile, and why is it useful in measuring bandwidth?

The 95th percentile is the smallest number that is greater than 95% of the numbers in a given set. The reason this statistic is so useful in measuring data throughput is that it gives a very accurate picture of the cost of the bandwidth. Here's an example. Suppose an ISP sells you a T1 line, but you're only using it to access the web. Even though you might frequently download very large files (filling the pipe), your cost to the ISP is negligible, because your usage is intermittent. A single T3 connection to the backbone could easily support hundreds of such downstream customers and never become saturated. As another example, suppose you are hosting a very busy web site that half fills your T1 for several hours every day. This type of bandwidth is more expensive, because your ISP can't oversell their connection to the backbone as effectively. The important thing to realize is that it doesn't cost your ISP anything to sell you a pipe of any particular size; it is the sustained rate of data transfer that costs them money. The sum of the 95th percentile usage of all of an ISP's customers predicts the peak amount of backbone traffic that the ISP will incur (in a given direction).
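
As a rough illustration of the calculation, the Python sketch below sorts a set of samples, discards the top 5%, and reports the highest value that remains. The function name percentile_95 and the sample values are made up for this example, and real billing systems may round or interpolate differently. For scale: a 30-day month of 5-minute samples is 8,640 samples, so the discarded top 5% is 432 samples, or 36 hours of the busiest traffic.

```python
def percentile_95(samples):
    """Return the highest sample left after discarding the top 5%
    (the nearest-rank method)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest rank = ceil(0.95 * n), computed with integers to avoid
    # floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

# Hypothetical data: 20 samples in Mbit/s, one of which is a spike.
samples = [2.0] * 19 + [90.0]
print(percentile_95(samples))  # 2.0, the single spike is discarded
```

With only 20 samples, the top 5% is exactly one sample, so the 90 Mbit/s spike does not affect the result.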

ISPs generally charge for bandwidth in one of three ways:

  1. Sell a flat rate, possibly bandwidth limited connection, and try to sell to customers whose usage patterns are not so intense. Nearly all DSL providers do this. The customers like it because they don't have to worry about how much bandwidth they use, and ISPs like it because it simplifies billing, and they make more money as long as they have plenty of low-usage customers. The problem, particularly if the ISP is selling very fast connections, is that the ISP can become overwhelmed by even a small number of high-usage customers. Even residential customers can be such high-usage clients, thanks to recently popular services such as peer-to-peer file sharing.
  2. Sell a fast connection (e.g. 100Mbit Ethernet, which is inexpensive) and charge for the volume of data transferred, e.g. the number of Gigabytes per month. This model works great for web sites, which almost always generate traffic in a predictable bell curve. However, it severely penalizes customers who use bandwidth intermittently. For example, suppose a customer runs an automated off-site backup every night. This brief usage spurt costs the ISP almost nothing. Yet although the recurring sustained data rate is low, the customer gets charged for a huge amount of bandwidth.
  3. Sell a fast connection and bill by 95th percentile. By now this should make sense - it's a fair system where everybody pays for what they get. The advantage to the customer is that they get the performance of a high-speed connection, while paying only for their actual usage. ISPs like it because they don't have to worry about high-usage customers upsetting their overselling ratios.
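
To make the difference between options 2 and 3 concrete, here is a small Python sketch of the nightly backup customer. All of the numbers are assumptions chosen for illustration: a 30-day month, a one-hour backup at 80 Mbit/s each night, and 0.5 Mbit/s of background traffic the rest of the time. The helper names are hypothetical as well.

```python
SAMPLE_SECONDS = 300                               # one sample every 5 minutes
SAMPLES_PER_DAY = 24 * 60 * 60 // SAMPLE_SECONDS   # 288 samples per day

def percentile_95(samples):
    """Highest sample left after discarding the top 5% (nearest rank)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def month_of_samples(days=30, backup_samples=12, backup_mbps=80.0, idle_mbps=0.5):
    """Build a month of 5-minute samples (Mbit/s) for the assumed customer:
    12 busy samples (one hour) per day, idle for the rest."""
    day = [backup_mbps] * backup_samples
    day += [idle_mbps] * (SAMPLES_PER_DAY - backup_samples)
    return day * days

def total_gigabytes(samples):
    """Total data transferred, treating each sample as a 5-minute average."""
    megabits = sum(rate * SAMPLE_SECONDS for rate in samples)
    return megabits / 8 / 1000   # Mbit -> MB -> GB (decimal units)

samples = month_of_samples()
print(f"volume billed: {total_gigabytes(samples):.0f} GB")
print(f"95th percentile billed: {percentile_95(samples)} Mbit/s")
```

Under these assumptions the backup occupies 12 of 288 daily samples, about 4.2% of the month, so it falls entirely inside the discarded top 5%. The customer would be billed for roughly 1235 GB under volume billing, but only for 0.5 Mbit/s under 95th percentile billing.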

Irrespective of billing concerns, the 95th percentile is a very interesting and useful figure. Bottom line is it tells you how much of your connection you're really using (and really need).

Network Layers

One common source of confusion among those new to the world of datacenter networking is the use of the terms "layer 2 switching" and "layer 3 switching." Some of you will no doubt be familiar with the seven-layer OSI model, which divides computer networking into seven conceptual layers, based on function. For those of you who aren't, our discussion must begin with an overview of this model. First, some terminology:
 
Host - An individual computer, connected to a network (or the Internet).
Network Interface Card - The ethernet card or wifi adapter on your computer.
Address - A unique identifier for a computer on a network.
 

Layer 1 - Physical

The physical layer consists of the ethernet and fiber optic cables that link computers together, as well as the wireless signals in wifi networks; it also consists of the raw hardware that transmits and receives data on each end. An example of a layer 1 device would be a hub. Hubs, though now mostly replaced by switches, were at one time commonly used to connect multiple servers or workstations into a network; they are pure layer 1 devices in that they take an input signal and repeat and amplify it through all of the connected interfaces on the device. This introduces performance problems: only one attached host can transmit data at any given time. When two hosts try to transmit simultaneously, the result is a collision, and each host will then use an algorithm to wait a random amount of time before retransmitting (this is called 'back-off'). The more hosts you have connected to a hub, the more collisions occur. Since the time wasted on collisions, and on waiting to retransmit, is time not spent sending information, the network becomes slower and slower the more hosts you connect to it, until finally it reaches an unmanageable state.
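
The text above does not name the back-off algorithm. Classic shared Ethernet uses truncated binary exponential back-off, and the Python sketch below shows the idea under that assumption. The function name backoff_slots is made up for this example, and the slot time shown is the one for 10 Mbit/s Ethernet.

```python
import random

SLOT_TIME_US = 51.2   # slot time for 10 Mbit/s Ethernet, in microseconds
MAX_EXPONENT = 10     # the waiting range stops growing after 10 collisions
MAX_ATTEMPTS = 16     # after 16 failed attempts the frame is dropped

def backoff_slots(collisions, rng=random):
    """Pick how many slot times to wait after the given number of
    consecutive collisions on the same frame."""
    if collisions < 1:
        raise ValueError("back-off only applies after a collision")
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions, giving up on this frame")
    exponent = min(collisions, MAX_EXPONENT)
    # Uniform random choice in 0 .. 2**exponent - 1.
    return rng.randrange(2 ** exponent)

for n in (1, 2, 3, 10, 15):
    slots = backoff_slots(n)
    print(f"collision {n}: wait {slots} slots ({slots * SLOT_TIME_US:.1f} us)")
```

Because the range of possible waits doubles with each collision, a busy hub spends more and more time idle between retries, which is the slowdown described above.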
 

Layer 2 - Data Link Layer

This is the lowest logical layer, sitting directly above the physical layer. At layer 2, devices communicate via MAC address, a value that is generally hardcoded into the Network Interface Card. Each host on a network has a unique MAC address; when one host wishes to send a message to another, it sends, at the lowest level, an Ethernet frame to that specific MAC address. The classic layer 2 device is the common Ethernet switch. A switch, instead of sending all received traffic out all of its interfaces (which is how hubs operate), learns which hosts are behind specific interfaces, and sends traffic addressed to those hosts through that interface only. These devices are what are commonly known as "layer 2 switches." They are extremely fast, but have one weakness, which we will explore next.
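
The learning behaviour of a layer 2 switch can be sketched in a few lines of Python. This is a simplified model, not how any particular switch is implemented: the class name LearningSwitch, the port numbers, and the MAC addresses are all invented for illustration, and real switches also age out old entries and handle VLANs.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}   # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports to send it out of."""
        # Learn: the sender is reachable through the port the frame arrived on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown destination: flood like a hub, except back to the sender.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []   # destination is on the same port; nothing to forward
        return [out_port]

switch = LearningSwitch(ports=[1, 2, 3])
print(switch.receive(1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02"))  # [2, 3]
print(switch.receive(2, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01"))  # [1]
print(switch.receive(1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02"))  # [2]
```

The first frame is flooded because the switch has not yet seen the destination. Once both hosts have sent a frame, traffic between them goes out of a single port only.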
 

What Is The Cloud?

For some, "the Cloud" is a visible mass of water particles in the sky above. For others, "the Cloud" is an ambiguous term for the internet as a whole, as in the old network diagrams where your local network of computers connects to a cloud-shaped representation of the public internet.

While it is still a rather general term for the internet, it is now being branded into more specific services.

Cloud computing

Cloud computing takes computer tasks which were once confined to a single machine on your desk and moves them onto the internet. This includes pretty much everything a desktop or single server once did. Some examples can be seen in the graphic below:

Cloud Computer Services

At CalPOP you can design and build your own cloud. Using one or many of our dedicated servers, you can provide data storage, databases or application services to your entire office (a virtual private cloud) or the public (a public cloud).

Very soon we will be offering Virtual Servers where you can adjust the amounts of system resources you are using on the fly, as needed.

What Is Colocation?

If you're new to the hosting world, it's common to wonder what the difference is between dedicated servers, colocated servers and other types of hosting plans. This blog series from CalPOP will address these questions common to those new to the world of web hosting and datacenter services.

Colocation 101

Colocation, like dedicated server hosting, is provided from a datacenter facility such as CalPOP's world-class Los Angeles datacenter.  The difference between dedicated server hosting and colocation hosting is that with colocation hosting, the customer provides the server equipment to be colocated, and is typically responsible for its maintenance and upkeep.   The colocation hosting customer pays the datacenter services provider a monthly, quarterly or annual fee that covers the cost of the space occupied by the colocated server, and the power and network bandwidth it consumes.   The colocation hosting customer typically purchases their server hardware from a major systems vendor such as Dell, HP, IBM, Oracle Sun, or SuperMicro.

CalPOP's colocation offerings are designed to meet the needs of everyone from small web 2.0 startups to major corporations. Smaller customers can colocate a single server, or multiple servers within shared rackspace. Enterprise customers can rent one or more cabinets or an entire private cage of our carrier-grade datacenter space.

The CalPOP Los Angeles datacenter is a world-class facility featuring redundant cooling, power and network infrastructure. It's very important when ordering colocation hosting services to select a datacenter provider that has redundant and reliable infrastructure. Our facilities are designed to meet the needs of the most demanding users, at a competitive price.

 

What Is A Dedicated Server?

 
If you're new to the hosting world, it's common to wonder what the difference is between dedicated servers, colocated servers and other types of hosting plans. This blog series from CalPOP will address these questions common to those new to the world of web hosting and datacenter services.

Dedicated Server 101 

A dedicated server is a web hosting server located in a datacenter facility such as CalPOP's Los Angeles datacenter, that the customer rents from the facility for a monthly fee.   The dedicated server remains the property of the hosting company, and the hosting company is responsible for all maintenance and upkeep on the unit. The customer pays a monthly, quarterly or yearly fee inclusive of the cost of renting the dedicated server, the cost of the space that the dedicated server occupies and the cost of the electrical power and network bandwidth the dedicated server consumes. 

CalPOP provides a range of dedicated server hosting options designed to meet the needs of everyone from small web 2.0 startups to major corporations. Our efficiency dedicated servers use the latest eco-friendly Atom CPUs from Intel for maximum green performance, whereas our enterprise and performance server lines use high-power Intel Xeon CPUs for the ultimate in enterprise speed and reliability.

 

 

 

Setting up Linux Firewall

Adding ConfigServerFirewall to your server. 

While Linux comes with built-in firewall software, it can be a bit difficult to configure correctly. Fortunately, there is an open-source software product available that makes it much easier: Config Server Firewall, available at http://configserver.com/cp/csf.html.

This software runs as a plug-in for cPanel, DirectAdmin, and Webmin, but needs to be installed via the command line (SSH as root).

To install, run the following commands as root:

# wget http://configserver.com/free/csf.tgz
# tar xzvf csf.tgz
# cd csf

If you have cPanel installed:
# ./install.cpanel.sh

If you have DirectAdmin installed:
# ./install.directadmin.sh

Otherwise:
# ./install.sh

Once installed, you can access the configuration via cPanel/WHM's Plugins section or in the lower-left column of DirectAdmin's main Admin page. If you use Webmin, go to the Webmin Modules page and install a new module from a local directory. The module will be located at /etc/csf/csfwebmin.tgz.

 

Setting Up Splunk Free Edition on Ubuntu Server 10.04, and Securing it With an Apache Proxy and iptables

Whether you're managing one device or 1,000, Splunk (http://www.splunk.com/) is a useful product as it allows you to aggregate and search diagnostic information from a variety of systems.  At CalPOP we use it as a central syslog server, allowing us to view the logs of our several hundred Cisco and Juniper switches and other infrastructure elements in one central place, search for specific events, and build reports and dashboards to track performance.   If you're operating on a small scale, you can use the Free Edition of Splunk, which allows you to index up to 500 MB of data per day.   The Free Edition will likely cover you until your environment reaches enterprise-scale (think hundreds or thousands of servers), at which time Splunk will be more than happy to take your money.

The Free Edition of Splunk has one irritating drawback, however: it lacks any form of built-in user account management or authentication. We will (partially) address that shortcoming in the course of this tutorial.

First, download Splunk from their website, and upload it to the home directory on your server. If you're running Ubuntu, which is what we use for our infrastructure within CalPOP, you can use a .deb package. There is also an .rpm for distros like CentOS and Fedora, and a tarball for everyone else. Once you've uploaded it, if you're on Ubuntu (and presumably Debian, although I've yet to try it on that much-venerated distribution), run this command:

dpkg -i splunk*.deb

Next, start Splunk: 

/opt/splunk/bin/splunk start

Now, browse to http://your-server:8000 (where your-server is the hostname or IP address of your box), and you'll enter the web interface. Feel free to poke around for a bit. By default, Splunk runs as a 30-day trial of the paid version; once you've had your fill, convert the license to the Free Edition (see the Splunk installation documentation for instructions), then log out. When you return to the site, you might notice, to your alarm, that it lets you straight in, without prompting for so much as a password to keep curious visitors away. This is naturally a concern for anyone who wants to run Splunk on a world-accessible box; you certainly do not want just anyone to be able to search through your syslogs or other diagnostic data indexed by Splunk.