{"id":1567,"date":"2015-01-08T17:46:45","date_gmt":"2015-01-08T23:46:45","guid":{"rendered":"https:\/\/justinparrtech.com\/JustinParr-Tech\/?p=1567"},"modified":"2015-01-08T20:08:59","modified_gmt":"2015-01-09T02:08:59","slug":"elliptical-distribution-curve","status":"publish","type":"post","link":"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/","title":{"rendered":"Elliptical Distribution Curve"},"content":{"rendered":"<p>How to guestimate peak volume, and volume at any arbitrary time using total volume with an elliptical distribution curve.<\/p>\n<p>Someone says, &#8220;we have 10,000 hits per day on our website&#8221;, but what does that mean from an instantaneous demand standpoint?<\/p>\n<p>A distribution curve can help you figure that out.<\/p>\n<p>&nbsp;<\/p>\n<p><!--more--><\/p>\n<p>&nbsp;<\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\"><p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<\/div><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#simple-formula-for-guestimating-peak-sessions-for-a-given-volume-of-traffic\" >Simple Formula for Guestimating Peak Sessions\u00a0for a Given Volume of Traffic<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#distribution-curve\" >Distribution Curve<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#histogram\" >Histogram<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a 
class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#distribution-patterns\" >Distribution Patterns<\/a><ul class='ez-toc-list-level-4' ><li class='ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#bell-curve\" >Bell Curve<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#linear-distribution\" >Linear Distribution<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#cosine-curve\" >Cosine Curve<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#elliptical-distribution\" >Elliptical Distribution<\/a><\/li><\/ul><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#generating-an-elliptical-distribution\" >Generating an Elliptical Distribution<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#calculate-peak-volume\" >Calculate Peak Volume<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-11\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#defining-the-timeline\" >Defining the Timeline<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-12\" 
href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#example-1-peak-volume\" >Example 1:\u00a0 Peak Volume<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-13\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#estimating-volume-at-a-given-time-slot\" >Estimating Volume at a Given Time Slot<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-14\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#example-2-estimating-volume-at-2-pm\" >Example 2: Estimating Volume at\u00a02 PM<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-15\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#example-3-estimating-volume-at-5-pm\" >Example 3:\u00a0 Estimating Volume at 5 PM<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-16\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#analysis\" >Analysis<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-17\" href=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/elliptical-distribution-curve\/#summary\" >Summary<\/a><\/li><\/ul><\/nav><\/div>\n\n<p>&nbsp;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"simple-formula-for-guestimating-peak-sessions-for-a-given-volume-of-traffic\"><\/span>Simple Formula for Guestimating Peak Sessions\u00a0for a Given Volume of Traffic<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>I&#8217;ll give you the goods up front.<\/p>\n<p style=\"padding-left: 30px;\"><strong>p = ( 4 * V ) \/ ( pi * t )<\/strong><\/p>\n<ul>\n<li>P is the peak volume per hour, for which we&#8217;re trying to solve<\/li>\n<li>V is the total volume for the time period<\/li>\n<li>t is the number of hours in the time 
period<\/li>\n<li>Pi is the constant 3.14 (etc&#8230;)<\/li>\n<\/ul>\n<p>Simple example:<\/p>\n<p>If you&#8217;re building a website that will run a business application, and you&#8217;re expecting 20,000 sessions per day, randomly distributed throughout the day, between 8 AM and 5 PM, then we can calculate the peak as follows:<\/p>\n<p style=\"padding-left: 30px;\">T is the time in hours, or 5 PM &#8211; 8 AM + 1 hour \u00a0(10)<\/p>\n<p style=\"padding-left: 30px;\">p = ( 4 * 20,000 ) \/ ( 3.1416 * 10 )<\/p>\n<p style=\"padding-left: 30px;\">p = 80,000 \/ 31.416<\/p>\n<p style=\"padding-left: 30px;\">p = <strong>2,547 hits during the peak hour<\/strong><\/p>\n<p>Or, between 42 and 43 sessions\u00a0per minute.<\/p>\n<p>If you want more information on how or why this works, read on&#8230;<\/p>\n<p>&nbsp;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"distribution-curve\"><\/span>Distribution Curve<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>A distribution curve is a way to generate, or guess, the points of a histogram based on certain known values &#8211; for example, knowing the total number of data points allows a distribution curve to plot how those data points are distributed among multiple slices or buckets, according to a formula.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"histogram\"><\/span>Histogram<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><em>&#8220;Slow down!\u00a0 What in the world is a histogram???&#8221;<\/em><\/p>\n<p>Fair enough!<\/p>\n<p><strong>A histogram counts the number of data points that fall within each slice of a specific range of values.<\/strong><\/p>\n<p>For example, if you go to the grocery store, there might be 10 checkout lanes.\u00a0 If you plot the lane number across the bottom (X-axis) of a graph, and then count the number of people in each lane, plotting that number as the Y coordinate corresponding to that lane&#8217;s X coordinate, you have a histogram!<\/p>\n<p>In this example, you have 
10 lanes (or buckets), and you&#8217;ve counted the number of people in each lane.<\/p>\n<p>When performing capacity planning, histograms are often used to count the number of hits for each hour of the day &#8211; e.g. the buckets are 0 (12:00 AM) to 23 (11:00 PM), and a hit gets counted in to one of the 24 buckets based on the time of day in which the &#8220;hit&#8221; (request) is submitted to the server.<\/p>\n<p>This type of graph helps answer the questions:<\/p>\n<ul>\n<li>What time of day is my system being used most often?<\/li>\n<li>When can I safely perform system maintenance, disrupting the fewest customers?<\/li>\n<li>What is my peak resource demand (utilization)?<\/li>\n<\/ul>\n<p>Creating a usage histogram requires detailed log analysis, and there are plenty of commercial and open source analysis tools that can create one.<\/p>\n<p>The problem with this approach is that you need usage data to create the histogram!\u00a0 What if you&#8217;re building a new service, and there is no data available?<\/p>\n<p><strong>A distribution curve can help you project what the histogram will look like<\/strong>.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"distribution-patterns\"><\/span>Distribution Patterns<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>There are various types of distribution patterns.<\/p>\n<p>&nbsp;<\/p>\n<h4><span class=\"ez-toc-section\" id=\"bell-curve\"><\/span>Bell Curve<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>Statisticians discuss a bell-shaped distribution curve. \u00a0This curve appears when analyzing how data points relate to the average of all data points within the set.<\/p>\n<p>This sounds really complicated, but let&#8217;s walk through it&#8230;<\/p>\n<p>Let&#8217;s say that you ask 100 people their age. \u00a0Let&#8217;s say that the minimum age is 10, and the maximum age is 80. \u00a0Statistically, the average is going to be about 35. 
\u00a0If you plot a histogram based on the distance (or difference) between each data point (person&#8217;s age) and 35, most data points will cluster at the 0 mark (right at 35) and will drop off in an inverse curve to the left and right of 0 (the data point labeled &#8220;35&#8221;). \u00a0At the outer edges, there will be a few data points in the low age range to the left, and a few data points in the high age range to the right.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1589\" src=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/Bell-Curve_Annotated.png\" alt=\"Bell Curve_Annotated\" width=\"600\" height=\"224\" srcset=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/Bell-Curve_Annotated.png 800w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/Bell-Curve_Annotated-300x112.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/p>\n<p>This shape is known as the bell-shaped curve, where roughly 90% of all data points fall within the center 50% of the entire range of data values.<\/p>\n<p>This resembles the familiar 80\/20 rule of thumb, where 80% of data points fall within the center 20% of the range of data values, while the remaining 20% are spread across the remaining 80% of the range.<\/p>\n<p>&nbsp;<\/p>\n<h4><span class=\"ez-toc-section\" id=\"linear-distribution\"><\/span>Linear Distribution<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>A linear distribution is literally a &#8220;line&#8221;.<\/p>\n<p>The formula for a line is:<\/p>\n<p style=\"padding-left: 30px;\"><strong>y=m * x + b<\/strong><\/p>\n<ul>\n<li>X is the independent variable<\/li>\n<li>M is the slope of the line (the change in y divided by the change in x)<\/li>\n<li>B is the Y intercept &#8211; the value of Y, where x=0.<\/li>\n<\/ul>\n<p>Linear distributions could be flat, meaning every Y value is the same. 
\u00a0What this means in the real world is that every time bucket has the same number of hits, or that traffic is constant around the clock. \u00a0This never happens. \u00a0Every application has peak and off-peak utilization.<\/p>\n<p>Linear distributions could have a fixed slope, meaning that traffic progressively increases or decreases at a constant rate (or maybe a combination of the two) until the cycle resets at a specific clock time. \u00a0Again, this never happens.<\/p>\n<p>Linear distribution is easy to calculate, but not very useful.<\/p>\n<p>&nbsp;<\/p>\n<h4><span class=\"ez-toc-section\" id=\"cosine-curve\"><\/span>Cosine Curve<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>A cosine curve is based on the cosine formula for a right triangle:<\/p>\n<p style=\"text-align: left; padding-left: 30px;\">COS = Adjacent \/ Hypotenuse<\/p>\n<p>To find the height for a given x coordinate, you must supply an angle.<\/p>\n<p>In degrees, the angle is specified in a range of 0 (north) through 180 (south) to 360 (or back to 0).<\/p>\n<p>In radians, the angle is specified in a range of 0 (north) through pi (south) to 2 * pi (or back to 0).<\/p>\n<p>By pinning the angle to the x coordinate, you can generate a curve.<\/p>\n<p>Using the time buckets example, if we map 360 degrees to 24 time buckets, that&#8217;s 15 degrees per hour.<\/p>\n<p>Given some peak magnitude, p, the height of the curve\u00a0at a specific time t is:<\/p>\n<p style=\"padding-left: 30px;\">y = COS (t * 15) * p<\/p>\n<p>This produces a familiar shape&#8230;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-1591\" src=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/Cosine_Curve-1024x512.png\" alt=\"Cosine_Curve\" width=\"627\" height=\"314\" srcset=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/Cosine_Curve-1024x512.png 1024w, 
https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/Cosine_Curve-300x150.png 300w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/Cosine_Curve.png 1200w\" sizes=\"auto, (max-width: 627px) 100vw, 627px\" \/><\/p>\n<p>In fact, the bell curve can be approximated using a cosine curve.<\/p>\n<p>Most real-world usage patterns can be approximated using a cosine curve.<\/p>\n<p>&nbsp;<\/p>\n<h4><span class=\"ez-toc-section\" id=\"elliptical-distribution\"><\/span>Elliptical Distribution<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>An ellipse is a stretched-out circle.<\/p>\n<p>Using half of an ellipse, you can generate a very good approximation of real-world usage, using the much simpler formulas that describe a circle.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"generating-an-elliptical-distribution\"><\/span>Generating an Elliptical Distribution<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Given a total daily volume of server hits, V, we can quickly approximate the peak volume, p, as well as the volume y, at any given time, t.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"calculate-peak-volume\"><\/span>Calculate Peak Volume<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>The area of a circle is described by the following formula:<\/p>\n<p style=\"padding-left: 30px;\">A = pi * r^2<\/p>\n<ul>\n<li>R\u00a0is the radius of the circle.<\/li>\n<li>PI is the constant, pi (3.14 etc&#8230;)<\/li>\n<\/ul>\n<p>An ellipse is a distorted circle, where the vertical and horizontal radii are different:<\/p>\n<p style=\"padding-left: 30px;\">A = pi * a * b<\/p>\n<ul>\n<li>A is the vertical radius<\/li>\n<li>B is the horizontal radius<\/li>\n<\/ul>\n<p>Let&#8217;s map some variables to our formula:<\/p>\n<ul>\n<li>V, our total volume, is really 1\/2 the area of the ellipse, or A\/2. 
\u00a0We don&#8217;t want our distribution to include the part of the ellipse that falls below the zero line.<\/li>\n<li>T\u00a0is our timeline, the number of time buckets where we expect the utilization to occur, which would be the entire horizontal diameter. \u00a0We only care about 1\/2 the timeline, or t\/2 as the horizontal radius B<\/li>\n<li>A is the vertical radius, which is actually the peak volume, p, which is what we&#8217;re trying to find.<\/li>\n<\/ul>\n<p>Putting it all together:<\/p>\n<p style=\"padding-left: 30px;\">V = A\/2<\/p>\n<p style=\"padding-left: 30px;\">A = pi * p * t\/2<\/p>\n<p style=\"padding-left: 30px;\"><em>therefore,<\/em><\/p>\n<p style=\"padding-left: 30px;\">V = (pi * p * t\/2) \/2<\/p>\n<p style=\"padding-left: 30px;\">V = pi * p * t \/4<\/p>\n<p style=\"padding-left: 30px;\"><em>Solving for p:<\/em><\/p>\n<p style=\"padding-left: 30px;\">p * pi * t = 4 * V<\/p>\n<p style=\"padding-left: 30px;\"><strong>p = (4 * V) \/ (pi * t)<\/strong><\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"defining-the-timeline\"><\/span>Defining the Timeline<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>The timeline is the series of time buckets included in the analysis.<\/p>\n<p>The ellipse will return a 0 value at the beginning, where it crosses the x axis (x intercept), and a second 0 value at the end, which is the second x intercept.<\/p>\n<p>This means that you need to include the time bucket immediately preceding the first period of utilization, as well as the time bucket immediately following the last period of utilization.<\/p>\n<p>Let&#8217;s say for example that the application in question will be primarily used from 8 AM to 8 PM.\u00a0 What this really means is that there will already be some utilization right at 8 AM, so the 8 AM time bucket can&#8217;t be 0.\u00a0 The timeline has to start at 7 AM (the first x intercept).<\/p>\n<p>At the end of the day, we&#8217;re saying that NO ONE is on the system AFTER 8 PM, 
reflecting that the last period for utilization is 7:00 PM through 7:59 PM &#8211; all of this sits in the 7:00 PM bucket.\u00a0 The 8 PM bucket should be empty, which is our second x intercept.<\/p>\n<p>We want to count the buckets FROM one intercept TO the other.\u00a0 We have 12 time buckets with actual data (8 AM &#8211;&gt; 7 PM), and we need to include one more to account for the fact that there is not zero traffic at 8 AM.<\/p>\n<p>So in this example, if we say we&#8217;re interested in traffic from 8 AM to 8 PM, we need to start our timeline at 7 AM, and our timeline has 13 buckets.<\/p>\n<p>The easiest rule of thumb is to include an empty time bucket to the left of your timeline.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"example-1-peak-volume\"><\/span>Example 1:\u00a0 Peak Volume<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Let&#8217;s continue with our example of 10,000 hits per day.<\/p>\n<p>Let&#8217;s say that most traffic is expected to hit the servers between 8 AM and 8 PM, or a total of 12 hours.<\/p>\n<p>We want to calculate the peak volume, which should occur at the middle of the timeline, around 1:30 PM, if we assume an elliptical distribution.<\/p>\n<p style=\"padding-left: 30px;\">V = 10,000 hits<\/p>\n<p style=\"padding-left: 30px;\">t = 12 time slots + 1 extra = 13 total<\/p>\n<p style=\"padding-left: 30px;\">p = (4 * V) \/ (pi * t)<\/p>\n<p style=\"padding-left: 30px;\">p = (4 * 10,000) \/ (3.1416 * 13)<\/p>\n<p style=\"padding-left: 30px;\">p = 40,000 \/ 40.84<\/p>\n<p style=\"padding-left: 30px;\"><strong>p = 980 hits during the peak hour, or 16 to 17 hits per minute.<\/strong><\/p>\n<p>If each session requires 15 KB of RAM, and is expected to last, on average, 2 hours, then we need to account for roughly double the number of peak-hour sessions, which works out to about 29 MB of RAM required.<\/p>\n<p>Again, assuming that the long average session length means 100% of sessions carry over from the hour before the peak, you have 
to provide resources for 1,960 sessions.<\/p>\n<p>Let&#8217;s say that your app is really CPU-intensive, and each session takes 1.2% CPU. \u00a0That&#8217;s 2,352% &#8212; meaning that you need roughly 24 CPU cores total, to support your application. \u00a0This might be a single machine with 4 physical CPUs and 6 cores per socket (24 cores) or it might be 6 virtual machines with 4 virtual CPUs per instance, or some other combination that makes sense.<\/p>\n<p>Now we have a sizing baseline for our application!<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"estimating-volume-at-a-given-time-slot\"><\/span>Estimating Volume at a Given Time Slot<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Beyond calculating peak volume, we can use the same distribution curve to figure out what our volume v will be at any given time, t1.<\/p>\n<p>The formula for a circle is:<\/p>\n<p style=\"padding-left: 30px;\">r^2 = x^2 + y^2<\/p>\n<ul>\n<li>R is the fixed radius of the circle<\/li>\n<li>X is the location left or right of the origin (coordinate 0,0)<\/li>\n<li>Y is the location above or below the origin<\/li>\n<\/ul>\n<p>Recall that an ellipse is a distorted circle:<\/p>\n<p style=\"padding-left: 30px;\">( x^2 \/ b^2) + (y^2 \/ a^2) = 1<\/p>\n<ul>\n<li>A is the vertical radius<\/li>\n<li>B is the horizontal radius<\/li>\n<\/ul>\n<p>Solving for y:<\/p>\n<p style=\"padding-left: 30px;\">(y^2 \/ a^2) = 1 &#8211; (x^2 \/ b^2)<\/p>\n<p style=\"padding-left: 30px;\">y^2 = a^2 (1 &#8211; x^2 \/ b^2 )<\/p>\n<p style=\"padding-left: 30px;\">For simplicity, let&#8217;s skip the associative distribution of a^2<\/p>\n<p style=\"padding-left: 30px;\">y = SQRT( a^2 (1 &#8211; x^2 \/ b^2 ) )<\/p>\n<p>Let&#8217;s substitute some variables:<\/p>\n<ul>\n<li>Y is our volume v at a specific time.<\/li>\n<li>A is equal to our peak volume, p<\/li>\n<li>B is equal to 1\/2 * t (the length of our timeline)<\/li>\n<li>X is a bit tricky.\n<ul>\n<li>A specific time bucket t1 is the clock 
time ct1 minus the beginning of the timeline, ct0. \u00a0We said that our timeline runs from 8 AM to 8\u00a0PM, but we added an extra bucket at 7 AM, so we would take a specific time ct1, and subtract the start of our timeline, 7 AM (ct0), to get the time bucket t1 (number of hours since 7 AM)<\/li>\n<li>The horizontal center of our ellipse is t\/2, so we have to subtract t\/2<\/li>\n<li>x = HOURS(ct1 &#8211; ct0) &#8211; t\/2<\/li>\n<\/ul>\n<\/li>\n<li>SQRT is the &#8220;square root&#8221; function<\/li>\n<li>HOURS is an imaginary function that converts a clock time in to an integer number of hours &#8212; added for simplicity and clarity<\/li>\n<\/ul>\n<p>Putting it all together:<\/p>\n<p style=\"padding-left: 30px;\"><strong>x = HOURS( ct1 &#8211; ct0 ) &#8211; t\/2<\/strong><\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( p^2 (1 &#8211; ( x^2 \/ (t\/2)^2 ) ) )<\/p>\n<p style=\"padding-left: 30px;\"><strong>v = SQRT( p^2 * ( 1 &#8211; ( 4 * x^2 \/ t^2 ) ) )<\/strong><\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"example-2-estimating-volume-at-2-pm\"><\/span>Example 2: Estimating Volume at\u00a02 PM<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li>ct1 is 2 PM.<\/li>\n<li>Our timeline starts at 7 AM (ct0)<\/li>\n<li>p (from our previous example) is 980<\/li>\n<li>t, our timeline is 13 timeslots<\/li>\n<li>v is the volume at ct1, for which we&#8217;re trying to solve<\/li>\n<\/ul>\n<p>Calculate X:<\/p>\n<p style=\"padding-left: 30px;\">x = HOURS( ct1 &#8211; ct0 ) &#8211; t\/2<\/p>\n<p style=\"padding-left: 30px;\">x = HOURS( 2PM &#8211; 7AM ) &#8211; 13\/2<\/p>\n<p style=\"padding-left: 30px;\">x = HOURS( 7 ) &#8211; 6.5<\/p>\n<p style=\"padding-left: 30px;\">x = 7 &#8211; 6.5<\/p>\n<p style=\"padding-left: 30px;\"><strong>x = 0.5<\/strong><\/p>\n<p>Calculate V:<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( p^2 * ( 1 &#8211; ( 4 * x^2 \/ t^2 ) ) )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 980^2 * (1 &#8211; (4 * 0.5^2 \/ 13^2 
) ) )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 960,400 * (1 &#8211; ( 1 \/ 169 ) ) )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 960,400 * (1 &#8211; 0.0059 ) )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 960,400 * 0.9941 )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 954,734 )<\/p>\n<p style=\"padding-left: 30px;\"><strong>v = 977<\/strong><\/p>\n<p>This verifies our calculation &#8211; near the middle of the curve, our volume nearly matches the peak volume of 980 (off by a little because 2 PM sits half an hour from the exact center of the curve, plus rounding error).<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"example-3-estimating-volume-at-5-pm\"><\/span>Example 3:\u00a0 Estimating Volume at 5 PM<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li>ct1 is 5 PM.<\/li>\n<li>Our timeline starts at 7 AM (ct0)<\/li>\n<li>p (from our previous example) is 980<\/li>\n<li>t, our timeline is 13 timeslots<\/li>\n<li>v is the volume at ct1, for which we&#8217;re trying to solve<\/li>\n<\/ul>\n<p>Calculate X:<\/p>\n<p style=\"padding-left: 30px;\">x = HOURS( ct1 &#8211; ct0 ) &#8211; t\/2<\/p>\n<p style=\"padding-left: 30px;\">x = HOURS( 5PM &#8211; 7AM ) &#8211; 13\/2<\/p>\n<p style=\"padding-left: 30px;\">x = HOURS( 10 ) &#8211; 6.5<\/p>\n<p style=\"padding-left: 30px;\">x = 10 &#8211; 6.5<\/p>\n<p style=\"padding-left: 30px;\"><strong>x = 3.5<\/strong><\/p>\n<p>Calculate V:<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( p^2 * ( 1 &#8211; ( 4 * x^2 \/ t^2 ) ) )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 980^2 * (1 &#8211; (4 * 3.5^2 \/ 13^2 ) ) )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 960,400 * (1 &#8211; ( 49 \/ 169 ) ) )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 960,400 * (1 &#8211; 0.2899 ) )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 960,400 * 0.7101 )<\/p>\n<p style=\"padding-left: 30px;\">v = SQRT( 681,980 )<\/p>\n<p style=\"padding-left: 30px;\"><strong>v = 
826<\/strong><\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"analysis\"><\/span>Analysis<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Given the examples above, here is a graph covering 7 AM to 8 PM:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-1582\" src=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_2-1024x512.png\" alt=\"EllipticalDistribution_2\" width=\"627\" height=\"314\" srcset=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_2-1024x512.png 1024w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_2-300x150.png 300w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_2.png 1200w\" sizes=\"auto, (max-width: 627px) 100vw, 627px\" \/><\/p>\n<p>Adding up each bar value, the total is 9,775 &#8211; not quite the total 10,000.\u00a0 The discrepancy arises because we are making 13 discrete slices of what is really a continuous curve.\u00a0 If we increase the number of slices, we will eventually approach our total hit count of 10,000.<\/p>\n<p>This approach assumes that people jump in right at 8 AM to start using the system, and no new sessions come in after 8 PM.\u00a0 In reality, you could add some margin to the timeline &#8211; for example, 15 minutes on each end, to account for early birds and stragglers.<\/p>\n<p><strong>t = 13.5<\/strong><\/p>\n<p>We have 12 time buckets, plus 1 extra time bucket, plus we&#8217;re adding 1\/2 hour to the timeline.<\/p>\n<p>If we do this, we&#8217;re moving the start of the graph BACK 15 minutes, so we need to add that back to X, to keep the center at 0:<\/p>\n<p>x = HOURS( ct1 &#8211; ct0 ) &#8211; t\/2 + <strong>0.25<\/strong><\/p>\n<p>The 15 minute offset shifts the traffic 15 minutes to the left.<\/p>\n<p>With the slightly longer timeline, but the same amount of volume, recalculating p 
with a timeline of 13.5 instead of 13 results in:<\/p>\n<p>p = <strong>943<\/strong><\/p>\n<p>Here is what the hit graph looks like, if users start accessing the system at 7:45 AM and new sessions end at 8:15 PM:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-1583\" src=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_3-1024x512.png\" alt=\"EllipticalDistribution_3\" width=\"627\" height=\"314\" srcset=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_3-1024x512.png 1024w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_3-300x150.png 300w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_3.png 1200w\" sizes=\"auto, (max-width: 627px) 100vw, 627px\" \/><\/p>\n<p>This graph looks much more realistic, and the total number of hits is 10,093, much closer to the 10,000 target than the original total of 9,775.<\/p>\n<p>Keep in mind &#8211; this is the hit count, which equates to when the user logs in to BEGIN their session.\u00a0 If we said that each session lasts about 2 hours, then each bar on the utilization graph needs to account for sessions that started in the previous 1 hour time frame:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-1584\" src=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_4-1024x512.png\" alt=\"EllipticalDistribution_4\" width=\"627\" height=\"314\" srcset=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_4-1024x512.png 1024w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_4-300x150.png 300w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_4.png 1200w\" sizes=\"auto, (max-width: 627px) 100vw, 627px\" \/><\/p>\n<p>Meaning, you&#8217;ll have 
active <em>sessions<\/em> until the last user logs out, just after 9 PM.<\/p>\n<p>However, if your application is something that users check infrequently throughout the day, a session might last 10 minutes, meaning that your utilization (concurrent session count) would be about 1\/6 of the hourly hit count:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-1585\" src=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_5-1024x512.png\" alt=\"EllipticalDistribution_5\" width=\"627\" height=\"314\" srcset=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_5-1024x512.png 1024w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_5-300x150.png 300w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_5.png 1200w\" sizes=\"auto, (max-width: 627px) 100vw, 627px\" \/><\/p>\n<p>Let&#8217;s say that people tend to use the application closer to the end of the day, say around 5 PM.\u00a0 By modifying X, we can create a skewed ellipse.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-1593\" src=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_6-1024x512.png\" alt=\"EllipticalDistribution_6\" width=\"627\" height=\"314\" srcset=\"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_6-1024x512.png 1024w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_6-300x150.png 300w, https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-content\/uploads\/EllipticalDistribution_6.png 1200w\" sizes=\"auto, (max-width: 627px) 100vw, 627px\" \/><\/p>\n<p>This looks very similar to our cosine curve!<\/p>\n<p>In the graph above, we used the exact same formula for v (the height of each bar), but modified x as follows:<\/p>\n<p style=\"padding-left: 30px;\">x1 = 
HOURS(ct1 &#8211; ct0)<\/p>\n<p>The old formula would look like this:<\/p>\n<p style=\"padding-left: 30px;\">x = x1 &#8211; t\/2 + 0.25<\/p>\n<p>We create quadratic growth using x1 compared to t:<\/p>\n<p style=\"padding-left: 30px;\">x = <strong>x1^2 \/ t<\/strong> &#8211; t\/2 + 0.25<\/p>\n<p>Since x1^2 \/ t starts out growing slowly and only catches up to t as x1 approaches t, this squared relationship condenses, or skews, the data points toward the right side of the graph.<\/p>\n<p>The exponent 2 can be used to control the rate of skewing.\u00a0 x1^2.5 \/ t^1.5, for example, produces an even more skewed curve.\u00a0 Note that we had to raise t to the power of 1.5 &#8211; this keeps the resulting number, although shifted, within the range of the original timeline&#8217;s x values.<\/p>\n<p>Also, if you add up the bar values, the total is only 9,117 &#8211; nearly 1,000 data points have been &#8220;compressed out&#8221; of our curve!\u00a0 The easiest way to compensate for this is to multiply p (our peak volume guestimate) by some fixed number, such as 1.1.\u00a0 If you were to do this, p becomes 1,037, and the sum of the bars equals 10,029 &#8211; right back in the correct ballpark.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"summary\"><\/span>Summary<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The best way to predict usage and resources is to look at actual data.<\/p>\n<p>However, if you don&#8217;t have live data, because the system is new, or no historic data is available, the best way to predict utilization and resources is to use a <span style=\"text-decoration: underline;\"><strong>distribution curve<\/strong><\/span>.<\/p>\n<p>Distribution curves can help you predict:<\/p>\n<ul>\n<li>Utilization in terms of concurrent sessions<\/li>\n<li>Logins for a given time period<\/li>\n<li>Peak utilization<\/li>\n<li>Peak computing resources required<\/li>\n<li>Peak bandwidth required<\/li>\n<li>Optimum maintenance windows<\/li>\n<\/ul>\n<p>Elliptical 
distribution curves use simple formulas, are easy to understand and manipulate, and can be mapped to real-world usage scenarios.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How to guestimate peak volume, and volume at any arbitrary time using total volume with an elliptical distribution curve. Someone says, &#8220;we have 10,000 hits per day on our website&#8221;, but what does that mean from an instantaneous demand standpoint? A distribution curve can help you figure that out. &nbsp;<\/p>\n","protected":false},"author":16,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-1567","post","type-post","status-publish","format-standard","hentry","category-analyses-and-responses"],"_links":{"self":[{"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/posts\/1567","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/comments?post=1567"}],"version-history":[{"count":10,"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/posts\/1567\/revisions"}],"predecessor-version":[{"id":1598,"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/posts\/1567\/revisions\/1598"}],"wp:attachment":[{"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/media?parent=1567"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/justinparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/categories?post=1567"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/justi
nparrtech.com\/JustinParr-Tech\/wp-json\/wp\/v2\/tags?post=1567"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}