This is the fourth article in an occasional series of articles about Octave, a free open-source numerical programming environment that is mostly compatible with MATLAB. This series began with the article Octave: An Alternative to the High Cost of MATLAB. This article discusses plotting and graphics in Octave. Octave has extensive plotting and graphics features including two-dimensional plots, histograms, three dimensional plots, and a range of specialized plots and charts such as pie charts. This article gives an overview of the key plotting and graphics features of Octave, discusses a few gotchas, and gives several illustrative examples.
Octave can plot functions and data using the built-in plot functio. To illustrate thiese features, this article uses data on the price of gasoline in the United States from the United States Energy Information Administration, part of the US Department of Energy. The data is taken from the Motor Gasoline Retail Prices, U.S. City Average report released March 29, 2011. This data is currently available in Adobe Acrobat PDF file format, a comma separated value (CSV) ASCII text format, and Microsoft Excel (XLS) spreadsheet format. The data from the CSV report was imported into Microsoft Excel, exported as an ASCII tab delimited text fiel and then cut and pasted into two tab-delimited ASCII files. The EIA gas price report contains several time series in a single file (leaded gas prices, unleaded gas prices, and several more). The leaded and unleaded gas prices were extracted manually in the Notepad++ text editor. These dataq files were named: leaded.txt and unleaded.txt.
unleaded = dlmread('unleaded.txt');
The plot of unleaded gasonline prices above was generated using the following Octave code:
unleaded = dlmread('unleaded.txt'); unleaded_rawtime = unleaded(:,2); unleaded_rawprice = unleaded(:,3); unleaded_good_index = find(unleaded_rawprice < 10); unleaded_time_x = unleaded_rawtime(unleaded_good_index); unleaded_price = unleaded_rawprice(unleaded_good_index); unleaded_time = fix(unleaded_time_x/100) + rem(unleaded_time_x,100)*(1./12.); plot(unleaded_time, unleaded_price); title('Unleaded Regular Gas Price'); xlabel('Year'); ylabel('US Dollars Per Gallon'); print('unleaded.jpg');
A few comments may be helpful. dlmread is an Octave function that reads ASCII data files. It is fairly flexible and can often automatically identify the separator used in ASCII data files such as the tab or a comma. If necessary, the user can explicitly specify the separator and other parameters of the data file. Nonetheless it is common to encounter data files with various quirks. For example, the EIA gas price report contains several time series in a single file. There are many months for which either data is not available or not reported; these are indicated by a value of 10000000 for the gas price. The code above uses the Octave find function to select the valid data. Further the year and month are combined in the format YYYYMM sso January 1970 would be “197001”, February 1970 is “197002”, and so forth. Used directly in the Octave plot function, this will produce a nonsense plot that is not useful.. Thus, the example code above uses Octave fix and mod functions to compute a time in years (month numbers are converted to fractions of a calendar year). Octave has numerous advanced functions such as find , fix, rem, and so forth that can be used to clean up and reformat data as needed.
The Octave function plot handles the actually plotting of the graph. The Octae plot function is a very versatile plotting function for two dimensional data such as time series. Both the Octave user manual and the build in help (help plot) provide detailed information on the use of plot.
Octave has built in support for histograms. A histogram is a way of displaying the frequency of occurrence of data or events. One might, for example, be interested in how often gas prices change by one percent, two percent, or ten percent in one month.
By default, the Octave hist function creates a histogram with ten bins, which is often not very useful. One can specify more bins easily.
Probability density functions are normalized ito unity(1.0). With Octave one can easily generate histograms normalized to 1.0.
Now the values in the histogram of percent changes in gas prices are normalized to 1.0. The histogram is an estimate of the probability density function for gas price changes. One might wonder whether this distribution is the Gaussian probability denisty, also known as the Normal or Bell Curve distribution. In fact, the histogram looks narrower than a typical Gaussian and also has some outliers, a long tail, which are not typical of a true Gaussian distribution. The next two figures show the Gaussian probability density function and a histogram of synthetic data generated with Gaussian statistics using the same mean and variance as the actual gas price data.
It is easy to see that the Gaussian, or Nornal or Bell Curve, distribution differs from the distribution of the gas price data. The actual data has a narrower, sharper peak and long tails. Every now and then, gas prices jump shaprly. This is a pattern seen in many financial and other kinds of assets . Many popular financial models such as the Black-Scholes option piricing model assume Gaussian or near-Gaussian distributions which generally understates the risks when a fniancial asset or commodity has long non-Gaussian tails as gasoline does.
The histograms and related plots above were generated using the following Octave code:
change = conv(unleaded_price, [-1 1]); len = length(unleaded_price); returns = change(2:len) ./ unleaded_price(2:len); returns = 100.0 * returns; % convert to percent plot(unleaded_time(2:len), returns, 'g;return;'); title('Returns on Gas'); xlabel('Year'); ylabel('Return (Percent)'); print('gas_returns.jpg'); printf('making plot 2\n'); fflush(stdout); figure(2) hist(returns); title('hist(returns)'); print('hist_returns.jpg'); xlabel('Return (Percent)'); ylabel('Counts'); printf('making plot 3\n'); fflush(stdout); figure(3) hist(returns, 100); title('hist(returns, 100) (100 BINS)'); xlabel('Return (Percent)'); ylabel('Counts'); print('hist100_returns.jpg'); printf('making plot 4\n'); fflush(stdout); figure(4) hist(returns, 100, 1.0); title('Normalized Returns'); xlabel('Return (Percent)'); ylabel('Normalized Counts'); print('hist_norm_returns.jpg'); printf('making plot 5\n'); fflush(stdout); figure(5) m = mean(returns); sigma = std(returns); x = (-100:100); plot(x, mygauss(x, m, sigma)); title('Gaussian with Same Mean/Standard Deviation'); xlabel('Return (Percent)'); ylabel('Probability'); print('gauss_model.jpg'); % let user know processing is done % in case figure does not pop in front k_returns = kurtosis(returns); check = sigma*randn(1, length(returns)) + m; k_check = kurtosis(check); m_check = mean(check); sigma_check = std(check); printf('making plot 6\n'); fflush(stdout); figure(6) hist(check, 100, 1.0); title('Histogram of Gaussian Model'); xlabel('Percent Chnage'); ylabel('Counts'); print('hist_gauss.jpg'); printf('ALL DONE'); fflush(stdout); beep()
and the mygauss function which implements the Gaussian (Normal/Bell Curve) probability density function:
function [result] = mygauss(x, mean, sigma) norm = 1.0/(sqrt(2*pi) * sigma); exponent = -1.0*((x - mean) ./ sigma).^2; result = norm * exp(exponent); end
Octave supports most common special plots such as stairs plots, bar charts, pie charts, and so forth.
This is a stairs plot of the gas price time series data for the first few years (notice the steps or stairs in the plot). THis is generated using the Octave stairs function
This is the so-called stem plot (Octave stem function):
This is a bar chart (Octave bar function). On Windows, the bar function seems to have trouble with large amounts of data unlike the other plotting functions.:
Horizontal Bar Chart
This is a horizontal bar chart (Octave barh function). On Windows, the bar function seems to have trouble with large amounts of data unlike the other plotting functions.
Full Source Code
The plots above were generated using the following Octave code:
% regular plot figure(1) plot(unleaded_time(1:36), unleaded_price(1:36)); title('plot(time, price) Regular Plot'); xlabel('Year'); ylabel('Price'); print('regular_plot.jpg'); % figure(2) stairs(unleaded_time(1:36), unleaded_price(1:36)); title('stairs(time, price) Stairs Plot'); xlabel('Year'); ylabel('Price'); print('stairs_plot.jpg'); % figure(3) stem(unleaded_time(1:36), unleaded_price(1:36)); title('stem(time, price) Stem Plot'); xlabel('Year'); ylabel('Price'); print('stem_plot.jpg'); % figure(4) bar(unleaded_time(1:10), unleaded_price(1:10)); title('bar(time, price) Bar Chart'); xlabel('Year'); ylabel('Price'); print('bar_plot.jpg'); figure(5) barh(unleaded_price(1:10)); title('barh(price) Horizontal Bar Chart'); xlabel('Year'); ylabel('Price'); print('barh_plot.jpg'); beep();
Octave can generate standard pie charts using the pie function:
This is the Octave code for the pie chart above (NOTE the use of the cellstr(‘country name’) syntax — this is needed);
% allegedly proven oil reserves pie chart %Venezuela 297 %Saudi Arabia 267 %Canada 179 %Iraq 143 %Iran 138 %Kuwait 104 reserves = [ 297, 267, 179, 143, 138, 104 ]; names = [ cellstr('Venezuela (297)'), cellstr('Saudi Arabia (267)'), cellstr('Canada (179)'), cellstr('Iraq (153)'), cellstr('Iran (138)'), cellstr('Kuwait (104)') ]; pie(reserves, names, [1 0 0 0 0 0]); title('Pie Chart of Six Larges Oil Nations'); xlabel('Billions of Barrels of Oil'); print('pie_reserves.jpg');
The third argument to pie (after names ) tells Octave to “explode” the first wedge (Venezuela).
Three Dimensional Graphics
Octave has functions to plot an display three dimensional data and functions:
The two 3D plots above were generated using the built-in peaksand sombrero test functions. It is possible to compute and display almost any 3D surface using the meshgrid function and 3D display functions such as plot3, surf, mesh, contour, and quiver.
For people coming from another type of programming such as the C family of langauges, the meshgrid concept and function is new and may take a little getting used to. A meshgrid is basically a very simple concept which is also very powerful. Octave represents almost everything as a “matrix” or multi-dimensional array. This is the source of much of the power of Octave. One can often avoid explicitly coding loops over the elements of the Octave matrix. This speeds development and reduces errors.
A meshgrid is a two (or higher) dimenaional array in which the elements of the array are the spatial location (x or y coordinate usually) of the associated element of a spatial grid (the mesh grid). Here is some Octave code which explicitly computes and plots the sombrero function:
ticks = [-10.0:0.5:10.0]; [x, y] = meshgrid(ticks, ticks); z = sin (sqrt (x.^2 + y.^2)) ./ (sqrt (x.^2 + y.^2)); surf(x,y,z); title('sombrero using meshgrid'); print('sombrero_meshgrid.jpg');
In this example, the Octave function meshgrid returns two arrays x and y which contin the x and y corrdinates respectively for the mesh grid elements. In this case, the x and y positions of the grid are specified by the one dimensional ticks array. The ticks run from -10 to 10.0 in steps of 0.5., a total of 41 ticks. The x array generated by meshgrid is a 41 by 51 array with the x coordinate of each element. The y array is a 41 by 41 element array with the y coordinate of each element.
The meshgrid enables one to express 3D surfaces or functions in Octave in a simple intuitive compact way:
z = sin (sqrt (x.^2 + y.^2)) ./ (sqrt (x.^2 + y.^2));
In this example, z is a two dimensional array (matrix) with the function value at the xand y coordinates specified by each grid point in the arrays xand y. Then, one can display the surface using display functions such as surf, mesh, and so forth. It usually takes some practice to get used to using meshgrid if one is not familiar with the concept.
Octave can create contour plots using the contourfunction.
Vector Field Plot
Octave can create vector field plots using the quiver (as in quiver of arrows) function:
These 3D plots were generated with the following Octave code:
% 3d graphics % tests figure(1); sombrero(); print('sombrero.jpg'); pause(1); figure(2); peaks(); print('peaks.jpg'); % meshgrid figure(3); ticks = [-10.0:0.5:10.0]; [x, y] = meshgrid(ticks, ticks); z = sin (sqrt (x.^2 + y.^2)) ./ (sqrt (x.^2 + y.^2)); surf(x,y,z); title('sombrero using meshgrid'); print('sombrero_meshgrid.jpg'); % plot3 figure(4); ticks = [-10.0:0.5:10.0]; [x, y] = meshgrid(ticks, ticks); z = sin (sqrt (x.^2 + y.^2)) ./ (sqrt (x.^2 + y.^2)); plot3(x,y,z); title('sombrero using plot3'); print('sombrero_plot3.jpg'); % mesh figure(5); ticks = [-10.0:0.5:10.0]; [x, y] = meshgrid(ticks, ticks); z = sin (sqrt (x.^2 + y.^2)) ./ (sqrt (x.^2 + y.^2)); mesh(x,y,z); title('sombrero using mesh'); print('sombrero_mesh.jpg'); % contour figure(6); ticks = [-10.0:0.5:10.0]; [x, y] = meshgrid(ticks, ticks); z = sin (sqrt (x.^2 + y.^2)) ./ (sqrt (x.^2 + y.^2)); contour(x,y,z); title('sombrero using contour'); print('sombrero_contour.jpg'); % quiver figure(7); ticks = [-10.0:0.5:10.0]; [x, y] = meshgrid(ticks, ticks); z = sin (sqrt (x.^2 + y.^2)) ./ (sqrt (x.^2 + y.^2)); theta = atan2(y, x); quiver(x,y, z.*cos(theta), z.*sin(theta)); title('sombrero vector field using quiver'); print('sombrero_quiver.jpg'); % all done disp('ALL DONE!'); beep();
Octave has cryptic error messages. These messages almost always correctly identify the line of code that is in error. The verbal descripton of the error is often incoprehensible and may be wrong. The column number reported for the location of an error in the line is often wrong, for example indicating the start of the expression on the right hand side of an assignment statement where the problem is later in the line of code.
If a user cannot spot the error by reading the line of code, a common occurence, it is usually best to convert the line of code into several lines of code with each lnew line of code representing a sub-expression o f the original line of code. This approach will usually narrow the error/bug down to a specific symbol and identify the specific error.
Octave supports both true-matrix operations and element by element (aka element-wise) operations. For example, A*B is true-matrix multiplication if A and B are matrices. A.*B is element by element multiplication in which each element is multiplied by its corresponding element in the other matrix. It is easy to mistakenly use * where one should use .* or .* where one should use * in Octave. Pay close attention to th edistincition between the true matrix and element-by-element operators.
Octave has extensive built in plotting and graphics functions. There are a few weaknesses, notably some problems with the bar chart functions, at least in the Windows version of Octave 3.2.4. Users coming from a different type of programming background such as the C family of languages may need a little time and practice to adjust to the meshgrid concept. The plotting and graphics funsions of Octave are more than adequate for all common scientfic, engineeering, and general analytical tasks, both two and three dimensional.
© 2011 John F. McGowan
About the Author
John F. McGowan, Ph.D. is a software developer, research scientist, and consultant. He works primarily in the area of complex algorithms that embody advanced mathematical and logical concepts, including speech recognition and video compression technologies. He has extensive experience developing software in C, C++, Visual Basic, Mathematica, MATLAB, and many other programming languages. He is probably best known for his AVI Overview, an Internet FAQ (Frequently Asked Questions) on the Microsoft AVI (Audio Video Interleave) file format. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech). He can be reached at [email protected].
Sponsor’s message: Check out Math Better Explained, an insightful ebook and screencast series that will help you see math in a new light and experience more of those awesome “aha!” moments when ideas suddenly click.