bill jelen, mrexcel 800 east 96th street indianapolis, indiana 46240 microsoft excel 2010 in depth copyright 2010 by que publishing all rights reserved. no part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. no patent liability is assumed with respect to the use of the information contained herein. although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, nor is any liability assumed for damages resulting from the use of the information contained herein. isbn-13: 978-0-789-74308-4 isbn-10: 0-789-74308-6 library of congress cataloging-in-publication data is on file. printed in the united states of america first printing: june 2010 trademarks all terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. que publishing cannot attest to the accuracy of this information. use of a term in this book should not be regarded as affecting the validity of any trademark or service mark. warning and disclaimer every effort has been made to make this book as complete and as accurate as possible, but no warranty or fitness is implied. the information provided is on an “as is” basis. the author and the publisher shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book or from the use of the cd or programs accompanying it. bulk sales que publishing offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales. for more information, please contact u.s.
corporate and government sales 1-800-382-3419 corpsales@pearsontechgroup.com for sales outside of the u.s., please contact international sales international@pearsoned.com associate publisher greg wiegand acquisitions editor loretta yates managing editor sandra schroeder project editor seth kerney copy editor barbara hacha indexer cheryl lenser proofreaders apostrophe editing services debbie williams technical editor bob umlas publishing coordinator cindy teeters multimedia developer dan scherf interior designer anne jones cover designer anne jones page layout mark shirar jake mcfarland nonie ratcliff dedication to bob jelen contents at a glance introduction 1 i changes in user interface 1 the file menu becomes the backstage view 7 2 the ribbon interface and quick access toolbar 23 3 using other excel interface improvements 39 4 customizing the ribbon 57 5 keyboard shortcuts 65 6 the excel options dialog 87 7 the big grid and file formats 109 ii calculating with excel 8 understanding formulas 123 9 controlling formulas 143 10 understanding functions 165 11 using everyday functions: math, date and time, and text functions 189 12 using powerful functions: logical, lookup, and database functions 277 13 using financial functions 343 14 using statistical functions 389 15 using trig, matrix, and engineering functions 503 16 connecting worksheets, workbooks, and external data 571 17 using super formulas in excel 599 18 using names in excel 619 19 fabulous table intelligence 643 iii business intelligence 20 sorting data 667 21 removing duplicates and filtering 679 22 using automatic subtotals 697 23 using pivot tables to analyze data 711 24 using slicers and filtering a pivot table 731 25 mashing up data with powerpivot 753 26 using what-if, scenario manager, goal seek, and solver 791 27 automating repetitive functions using vba macros 817 28 more tips and tricks for excel 2010 857 29 tour of the best add-ins for excel 871 iv visual presentation of data 30 formatting
worksheets 875 31 using data visualizations and conditional formatting 925 32 using excel charts 953 33 using sparklines 983 34 using smartart, shapes, wordart, and text boxes 997 35 using pictures and clip art 1023 v sharing 36 printing 1039 37 excel web app and other ways to share workbooks 1059 38 saving time using the easy-xl program 1079 index 1097 contents introduction 1 i changes in user interface 1 the file menu becomes the backstage view 7 understanding “in” versus “out” commands 7 using backstage view 8 pressing the esc key to close backstage view 8 using the four quick commands in the left navigation 8 opening recent files 9 one-click access to recent files 11 recovering unsaved workbooks 13 clearing the recent workbooks list 14 getting information about the current workbook 14 correcting special states such as disabled macros and links 15 excel’s automatic trusting of a document 16 opening a file in the protected view sandbox 16 marking a workbook as final to prevent editing 16 finding hidden content using the document inspector 17 creating a new workbook from a template 18 printing and print preview 18 sharing your workbook using save & send 20 getting updates and help 21 2 the ribbon interface and quick access toolbar 23 using the ribbon 23 using dialog launchers and the 80/20 rule 24 using flyout menus and galleries 25 the ribbon is constantly changing 27 harnessing contextual ribbon tabs 27 resizing excel changes the ribbon 29 solving common ribbon problems 30 you cannot find a particular command on the ribbon 30 you still cannot find the command on the ribbon 30 the ribbon takes up too many rows 31 you do not like where something is located on the ribbon 32 you cannot see all your favorite commands at once 32 using the quick access toolbar 32 changing the location of the quick access toolbar 33 adding favorite commands to the quick access toolbar 34 knowing which commands can be on the quick access toolbar 34 removing commands from the quick access
toolbar 34 customizing the quick access toolbar 35 using the excel options to customize the quick access toolbar for all workbooks 36 customizing icons for the current workbook only 36 filling up the quick access toolbar 37 rearranging icons on the quick access toolbar 37 resetting the quick access toolbar 37 assigning vba macros to quick access toolbar buttons 37 3 using other excel interface improvements 39 using live preview 40 previewing paste using the paste options gallery 41 accessing the gallery after doing a paste 42 accessing the paste options gallery from the right-click menu 44 accessing the paste options gallery from the paste drop-down 47 using the mini toolbar to format selected text 47 getting the mini toolbar back 49 disabling the mini toolbar 49 expanding the formula bar 50 zooming in and out on a worksheet 51 using the status bar to add numbers 52 switching between normal view, page break preview, and page layout view modes 52 using the new sheet icon to add worksheets 54 dragging a worksheet to a new location 54 inserting a worksheet in the middle of a workbook 55 4 customizing the ribbon 57 performing a simple ribbon modification 57 using a more complex ribbon modification 59 hiding/showing ribbon tabs 62 adding a new ribbon tab 62 sharing customizations with others 63 resetting customizations 63 questions about ribbon customization 63 5 keyboard shortcuts 65 using new keyboard accelerators 65 selecting icons on the ribbon 67 selecting options from a gallery 67 navigating within drop-down lists 68 backing up one level through a menu 68 dealing with keyboard accelerator confusion 68 selecting from legacy dialog boxes 68 using the shortcut keys 69 using excel 2003 keyboard accelerators 76 invoking an excel 2003 alt shortcut 77 determining which commands work in legacy mode 78 6 the excel options dialog 87 introducing the excel options dialog 87 getting help with a setting 90 using autorecover options 91 new excel 2010 options for
internationalization 91 new excel 2010 options for performance 92 new excel 2010 options for security 94 ten options to consider 97 five excel oddities 98 guide to excel options 99 7 the big grid and file formats 109 excel grid limits 109 why are there only 65,536 rows in my excel 2007 spreadsheet? 110 other limits in excel 2010 111 tips for navigating the big grid 113 using shortcut keys to move around 113 using the end key to navigate 113 using the current range to navigate 114 using go to to navigate 114 understanding the new file formats 114 a brief history of file formats 115 using the new binary file format: biff12 116 using the new xml file formats: xlsx and xlsm 116 version compatibility 118 opening excel 2010 files in excel 2002 or 2003 120 minor loss of fidelity 120 significant loss of functionality 121 creating excel 2010 file formats in excel 2003 121 opening excel 2010 files in excel 2007 122 ii calculating with excel 8 understanding formulas 123 getting the most from this chapter 123 introduction to formulas 124 formulas versus values 124 entering your first formula 125 building a formula 125 the relative nature of formulas 126 overriding relative behavior: absolute cell references 127 using mixed references to combine features of relative and absolute references 128 using the f4 key to simplify dollar sign entry 129 three methods of entering formulas 132 enter formulas using the mouse method 133 entering formulas using the arrow-key method 133 entering the same formula in many cells 135 copying a formula by using ctrl+enter 135 copying a formula by dragging the fill handle 136 double-click the fill handle to copy a formula 137 use the table tool to copy a formula 138 9 controlling formulas 143 formula operators 143 order of operations 144 stacking multiple parentheses 146 understanding error messages in formulas 147 using formulas to join text 149 joining text and a number 150 copying versus cutting a formula 151 automatically formatting formula cells
154 using date math 155 troubleshooting formulas 157 highlighting all formula cells 157 seeing all formulas 158 editing a single formula to show direct precedents 160 using formula auditing arrows 160 tracing dependents 162 using the watch window 162 evaluate a formula in slow motion 162 evaluating part of a formula 163 excel in practice: moving the formula tooltip 164 10 understanding functions 165 working with functions 167 the formulas tab in excel 2010 167 finding the function you need 168 using autocomplete to find functions 169 using the function wizard to find functions 169 getting help with excel functions 170 using in-cell tooltips 170 using the function arguments dialog 171 using excel help 172 using autosum 172 potential problems with autosum 173 special tricks with autosum 174 using the autosum drop-down 175 using the new general-purpose functions in excel 176 like subtotal, but better: aggregate() 176 added in excel 2007: iferror() 177 using conditional formulas with multiple conditions: sumifs(), averageifs(), and countifs() 178 functions with new variations in excel 2010 178 calculating multiple mode values 178 calculating workdays 179 handling ties in the rank function 179 calculating percentiles and quartiles 179 calculating ceiling and floor for negative values 179 functions that have been renamed 180 using worksheets with legacy function names 181 cube functions introduced in excel 2007 183 using the former atp functions 184 function reference chapters 186 11 using everyday functions: math, date and time, and text functions 189 examples of math functions 197 using sum to add numbers 198 using aggregate to ignore error cells or filtered rows 199 using count or counta to count numbers or nonblank cells 203 using round, rounddown, roundup, int, trunc, floor, floor.precise, ceiling, ceiling.precise, even, odd, or mround to remove decimals or round numbers 204 using subtotal instead of sum with multiple levels of totals 211 using rand and randbetween
to generate random numbers and data 215 using abs() to figure out the magnitude of error 218 using pi to calculate cake or pizza pricing 220 using combin to figure out lottery probability 221 using fact to calculate the permutation of a number 221 using gcd and lcm to perform seventh-grade math 222 using multinomial to solve a coin problem 223 using mod to find the remainder portion of a division problem 224 using quotient to isolate the integer portion in a division problem 225 using product to multiply numbers 226 using sqrt and power to calculate square roots and exponents 227 using sign to determine the sign of a number 228 using countif, averageif, and sumif to conditionally count, average, or sum data 229 using conditional formulas with multiple conditions: sumifs(), averageifs(), and countifs() 231 dates and times in excel 233 understanding excel date and time formats 236 examples of date and time functions 239 using now and today to calculate the current date and time or current date 239 using year, month, day, hour, minute, and second to break a date/time apart 240 using date to calculate a date from year, month, and day 240 using time to calculate a time 242 using datevalue to convert text dates to real dates 243 using timevalue to convert text times to real times 244 using weekday to group dates by day of the week 245 using weeknum to group dates into weeks 246 alternative calendar systems and days360 246 using yearfrac or datedif to calculate elapsed time 248 using edate to calculate loan or investment maturity dates 251 using eomonth to calculate the end of the month 252 using workday or networkdays to calculate workdays 253 using international versions of workday or networkdays 255 examples of text functions 256 joining text with the ampersand (&) operator 256 using lower, upper, or proper to convert text case 257 using trim to remove trailing spaces 258 using clean to remove nonprintable characters from text 260 using the char function to generate any
character 262 using the code function to learn the character number for any character 262 using left, mid, or right to split text 264 using len to find the number of characters in a text cell 266 using search or find to locate characters in a particular cell 267 using substitute and replace to replace characters 270 using rept to repeat text multiple times 272 using exact to test case 272 using the t and value functions 275 using functions for non-english character sets 275 12 using powerful functions: logical, lookup, and database functions 277 examples of logical functions 283 using the if function to make a decision 283 using the and function to check for two or more conditions 284 using or to check whether any conditions are met 286 using the true and false functions 288 using the not function to simplify the use of and and or 288 using the iferror function to simplify error checking 289 examples of information functions 291 using the is functions to test for errors 291 using the isref function 294 using the n function to add a comment to a formula 294 using the na function to force charts to not plot missing data 296 using the info function to print information about a computer 297 using the cell function 298 using type to determine type of cell value 302 examples of lookup and reference functions 302 using the choose function for simple lookups 303 using vlookup with true to find a value based on a range 304 using column to assist with vlookup when filling a wide table 308 using hlookup for horizontal lookup tables 310 using the match function to locate the position of a matching value 312 using index and match for a left lookup 314 using match and index to fill a wide table 316 performing many lookups with lookup 317 using functions to describe the shape of a contiguous reference 318 using areas and index to describe a range with more than one area 320 using numbers with offset to describe a range 322 using address to find the address for any cell 325 using
indirect to build and evaluate cell references on-the-fly 326 using the hyperlink function to quickly add hyperlinks 329 using the transpose function to formulaically turn data 330 using the rtd function and com add-ins to retrieve real-time data 332 using getpivotdata to retrieve one cell from a pivot table 332 examples of database functions 334 using dsum to conditionally sum records from a database 335 using the dget function 339 13 using financial functions 343 examples of common household loan and investment functions 348 using pmt to calculate the monthly payment on an automobile loan 348 using rate to determine an interest rate 349 using pv to figure out how much house you can afford 350 using nper to estimate how long a nest egg will last 352 using fv to estimate the future value of a regular savings plan 352 examples of functions for financial professionals 354 using ppmt to calculate the principal payment for any month 354 using ipmt to calculate the interest portion of a loan payment for any month 355 using cumipmt to calculate total interest payments during a time frame 356 using cumprinc to calculate total principal paid in any range of periods 357 using effect to calculate the effect of compounding period on interest rates 358 using nominal to convert the effective interest rate to a nominal rate 359 examples of depreciation functions 359 using sln to calculate straight-line depreciation 360 using db to calculate declining-balance depreciation 361 using ddb to calculate double-declining-balance depreciation 362 using syd to calculate sum-of-years’-digits depreciation 364 functions for investment analysis 366 using the npv function to determine net present value 366 using irr to calculate the return of a series of cash flows 368 using mirr to calculate internal rate of return, including interest rates 369 using xnpv to calculate the net present value when the payments are not periodic 369 using xirr to calculate a return rate when cash flow dates are
not periodic 371 examples of functions for bond investors 372 using yield to calculate a bond’s yield 373 using price to back into a bond price 374 using received to calculate total cash generated from a bond investment 376 using intrate to back into the coupon interest rate 377 using disc to back into the discount rate 378 handling bonds with an odd number of days in the first or last period 380 using pricemat and yieldmat to calculate price and yield for zero-coupon bonds 380 using pricedisc and yielddisc to calculate discount bonds 381 calculating t-bills 382 using accrint or accintm to calculate accrued interest 384 using duration to understand price volatility 384 examples of miscellaneous financial functions 386 using dollarde to convert to decimals 386 using fvschedule to calculate the future value for a variable scheduled interest rate 387 14 using statistical functions 389 examples of functions for descriptive statistics 401 using min or max to find the smallest or largest numeric value 402 using large to find the top n values in a list of values 403 using small to sequence a list in date sequence 405 using median, mode.sngl, mode.mult, and average to find the central tendency of a data set 406 using trimmean to exclude outliers from the mean 409 using geomean to calculate average growth rate 410 using harmean to find average speeds 411 using averageif or averageifs 411 using rank to calculate the position within a list 412 using percentile.inc to calculate percentile 415 using percentrank.inc to assign a percentile to every record 417 using avedev, devsq, var.s, and stdev.s to calculate dispersion 418 examples of functions for regression and forecasting 421 considerations when using regression analysis 421 regression function arguments 423 functions for simple straight-line regression: slope and intercept 424 using linest to calculate straight-line regression with complete statistics 425 case study: application of regression analysis 427 using forecast to
calculate prediction for any one data point 430 using trend to calculate many future data points at once 431 case study: forecasting using regression analysis 432 using logest to perform exponential regression 433 using growth to predict many data points from an exponential regression 435 exponential regression used to predict future generations 437 using pearson to determine whether a linear relationship exists 437 using rsq to determine the strength of a linear relationship 438 using steyx to calculate standard regression error 439 using correl to calculate positive or negative correlation 441 using fisher to perform hypothesis testing on correlations 443 using skew and kurtosis 443 examples of functions for inferential statistics 445 understanding the language of inferential statistics 445 using binom.dist to determine probability 448 using binom.inv to cover most of the possible binomial events 449 using negbinom.dist to calculate probability 450 using poisson.dist to predict a number of discrete events over time 451 using frequency to categorize continuous data 452 using norm.dist to calculate the probability in a normal distribution 455 using norm.inv to calculate the value for a certain probability 456 using norm.s.dist to calculate probability 457 using norm.s.inv to calculate a z score for a given probability 458 using standardize to calculate the distance from the mean 459 using chisq.test to perform goodness-of-fit testing 463 the sum of squares functions 464 testing probability on logarithmic distributions 467 using gamma.dist and gamma.inv to analyze queuing times 469 calculating probability of beta distributions 470 using f.test to measure differences in variability 472 other distributions: exponential, hypergeometric, and weibull 473 using z.test, confidence.norm, and confidence.t to calculate confidence intervals 476 using z.test to accept or reject a hypothesis 477 using permut to calculate the number of possible arrangements 478 using the analysis
toolpak to perform statistical analysis 479 installing the analysis toolpak in excel 2010 479 generating random numbers based on various distributions 480 generating a histogram 481 generating descriptive statistics of a population 483 ranking results 484 using regression to predict future results 485 using a moving average to forecast sales 487 using exponential smoothing to forecast sales 488 using correlation or covariance to calculate the relationship between many variables 490 using sampling to create random samples 491 using anova to perform analysis of variance testing 493 using the f-test to measure variability between methods 497 performing a z-test to determine whether two samples have equal means 498 performing student’s t-testing to test population means 499 using functions versus the analysis toolpak tools 501 15 using trig, matrix, and engineering functions 503 a brief review of trigonometry basics 507 radians versus degrees 507 pythagoras and right triangles 508 one side plus one angle trigonometry 509 using tan to find the height of a tall building from the ground 510 using sin to find the height of a kite in a tree 511 using cos to figure out a ladder’s length 512 using the arc functions to find the measure of an angle 514 using atan2 to calculate angles in a circle 516 emulating gravity using hyperbolic trigonometry functions 517 examples of logarithm functions 519 common logarithms on a base-10 scale 520 using log to calculate logarithms for any base 522 using ln and exp to calculate natural logarithms 523 working with imaginary numbers 526 using complex to convert a and b into a complex number 527 using imreal and imaginary to break apart complex numbers 528 using imsum to add complex numbers 529 using imsub, improduct, and imdiv to perform basic math on complex numbers 530 using imabs to find the distance from the origin to a complex number 531 using imargument to calculate the angle to a complex number 532 using imconjugate to reverse the sign 
of an imaginary component 533 calculating powers, logarithms, and trigonometry functions with complex numbers 533 solving simultaneous linear equations with matrix functions 534 using mdeterm to determine whether a simultaneous equation has a solution 538 using seriessum to approximate a function with a power series 539 using sqrtpi to find the square root of a number multiplied by pi 540 using sumproduct to sum based on multiple conditions 541 examples of engineering functions 543 converting from decimal to hexadecimal and back 544 converting from decimal to octal and back 545 converting from decimal to binary and back 546 explaining the two’s complement for negative numbers 547 converting from binary to hex to octal and back 548 using convert to convert english to metric 549 using delta or gestep to filter a set of values 564 using erf and erfc to calculate the error function and its complement 566 calculating the bessel functions 567 using the analysis toolpak to perform fast fourier transforms (ffts) 568 16 connecting worksheets, workbooks, and external data 571 connecting two worksheets 571 creating links using paste options menu 573 creating links using right-drag menu 575 building a link by using the mouse 576 links to external workbooks default to absolute references 577 building a formula by typing 577 creating links to unsaved workbooks 578 using the links tab on the trust center 578 opening workbooks with links to closed workbooks 579 dealing with missing linked workbooks 579 preventing the update links dialog from appearing 580 connecting to data on a web page 581 setting up a connection to a web page 581 managing properties for web queries 584 setting up a connection to a text file 584 setting up a connection to an access database 588 setting up sql server, xml, ole db, and odbc connections 589 connecting to xml data 590 connecting using microsoft query 592 managing connections 595 17 using super formulas in excel 599 using 3d formulas to spear through
many worksheets 599 referring to the previous worksheet 600 combining multiple formulas into one formula 603 calculating a cell reference in the formula by using the indirect function 606 using offset to refer to a range that dynamically resizes 608 assigning a formula to a name 609 turning a range of formulas on its side 611 replacing multiple formulas with one array formula 613 setting up an array formula 615 understanding an array formula 616 coercing a range of dates using an array formula 616 18 using names in excel 619 use the name box to define a name for a cell 619 naming a cell by using the name dialog 621 using the name box for quick navigation 622 using scope to allow duplicate names in a workbook 622 using named ranges to simplify formulas 624 retroactively applying names to formulas 625 using names to refer to multiple-cell ranges 627 dealing with invalid legacy naming 627 adding many names at once from existing labels and headings 628 managing names 631 filtering the name manager dialog 632 using a name to simplify an absolute reference 633 using a name to hold a value 634 assigning a formula to a name 635 using basic named formulas 636 using dynamic named formulas 636 using a named formula to point to the cell above 638 19 fabulous table intelligence 643 defining suitable data for excel tables 644 defining a table 644 keeping headers in view 645 freezing worksheet panes 646 clearing freeze panes 647 using the old version of freeze panes for absolute control 647 adding a total row to a table 649 toggling totals 649 expanding a table 650 adding rows to a table automatically 651 manually resizing a table 651 adding new columns to a table 651 adding new formulas to tables 652 stopping the automatic copying of formulas 653 formatting the results of a new formula 654 selecting only the data in the column 654 selecting by right-clicking 655 selecting by using shortcuts 655 selecting by using the arrow mouse pointers 656 using table data for charts to ensure
stickiness 658 replacing named ranges with table references 659 referencing an entire table from outside the table 659 referencing table columns from outside a table 660 using structured references to refer to tables in formulas 662 creating banded rows and columns with table styles 663 customizing a table style: creating double-height banded rows 663 creating banded rows outside a table 665 dealing with the autofilter drop-downs 665 iii business intelligence 20 sorting data 667 introducing the sort dialog 668 using specialized sorting 669 sorting by color or icon 669 factoring case into a sort 670 reordering columns with a left-to-right sort 671 sorting into a unique sequence by using custom lists 672 one-click sorting 674 sorting by several columns using one-click sorting 674 sorting randomly 675 21 removing duplicates and filtering 679 filtering records 679 using a filter 680 selecting one or multiple items from the filter drop-down 681 identifying columns with filters 682 combining filters 683 clearing filters 683 refreshing filters 683 resizing the filter drop-down 683 filtering by selection—hard way 684 filtering by selection—easy way 684 filtering by color or icon 685 handling date filters 687 using special filters for dates, text, and numbers 688 sorting filtered results 689 totaling filtered results 690 formatting and copying filtered results 690 using the advanced filter command 691 using remove duplicates to find unique values 693 removing duplicates based on several columns 694 handling duplicates other ways 695 combining duplicates and adding values 695 22 using automatic subtotals 697 adding automatic subtotals 697 working with the subtotals 699 showing a one-page summary with only the subtotals 699 sorting the collapsed subtotal view so the largest customers are on top 699 copying only the subtotal rows 701 formatting the subtotal rows 703 removing subtotals 705 using specialty subtotal techniques 705 summing some columns while counting another column
705 adding a blank row after each subtotal 707 add subtotals by two fields 709 23 using pivot tables to analyze data 711 creating your first pivot table 713 when you have your data in the correct format, creating and changing a pivot table is easy. 713 dealing with the compact layout 715 rearranging a pivot table 716 finishing touches: numeric formatting and removing blanks 718 four things you have to know when using pivot tables 720 your pivot table is in manual calculation mode until you click refresh! 720 one blank cell in a value column causes excel to count instead of sum 720 if you click outside of the pivot table, all the pivot table tools disappear 721 you cannot change, move a part of, or insert cells in a pivot table 721 calculating and roll-ups with pivot tables 721 grouping daily dates to months and years 721 adding calculations outside the pivot table 724 showing percentage of total 725 showing running totals and rank 726 using a formula to add a field to a pivot table 726 formatting a pivot table 729 using the pivottable styles 729 finding more information on pivot tables 730 24 using slicers and filtering a pivot table 731 filtering using the row label filter 731 filtering using the search box 733 clearing a filter 735 filtering using the check boxes 735 filtering using the label filter flyout 735 filtering using the date filters 737 filtering using value filters 737 filtering to the top 10 739 filtering using report filter fields 742 arranging the filters 743 selecting multiple items 743 filtering using slicers 745 adding slicers 745 arranging the slicers 746 formatting the slicers 747 using the slicers 747 filtering oddities 749 autofiltering a pivot table 749 applying row label filters to fields not in the pivot table report 749 replicating a pivot table for every customer 750 sorting a pivot table 750 why not sort using the data tab?
752 25 mashing up data with powerpivot 753 benefits and drawbacks to powerpivot 753 mega-benefits of powerpivot 753 moderate benefits of powerpivot 754 why is this free? 754 benefits of the server version of powerpivot 755 drawbacks to using powerpivot 756 installing powerpivot 756 case study: building a powerpivot report 757 import a text file 757 add excel data by copying and pasting 761 add excel data by linking 763 define relationships 764 add calculated columns using dax 764 build a pivot table 766 slicers in powerpivot 768 some things are different 770 two kinds of dax calculations 771 dax calculations for calculated columns 771 using related() to base a column calculation on another table 774 using dax to create new measures 777 count distinct using dax 777 when “filter, then calculate” doesn’t work in dax measures 781 mix in those amazing time intelligence functions 784 other notes 787 combination layouts 787 report formatting 788 refreshing powerpivot versus refreshing pivot table 789 getting your data into powerpivot with sql server 789 other issues 790 26 using what-if, scenario manager, goal seek, and solver 791 using what-if 791 creating a two-variable what-if table 792 using scenario manager 795 creating a scenario summary report 798 adding multiple scenarios 800 using goal seek 802 using solver 807 installing solver 807 solving a model using solver 808 27 automating repetitive functions using vba macros 817 checking security settings before using macros 817 enabling vba security 818 recording a macro 818 case study: macro for formatting for a mail merge 819 how not to record a macro: the default state of the macro recorder 822 relative references in macro recording 824 starting the macro recorder 824 running a macro 826 everyday-use macro example: formatting an invoice register 827 using the end key to handle a variable number of rows 827 editing a macro 830 understanding vba code—an analogy 831 comparing object.method to nouns and verbs 832 comparing
collections to plural nouns 832 comparing parameters to adverbs 832 comparing adjectives 836 using the analogy while examining recorded code 836 using simple variables and object variables 837 using r1c1-style formulas 838 fixing calculation errors in macros 840 customizing the everyday-use macro example: getopenfilename and getsaveasfilename 840from-scratch macro example: loops, flow control, and referring to ranges 842 finding the last row with data 842 looping through all rows 843 referring to ranges 844 combining a loop with finalrow 844 making decisions by using flow control 845 putting together the from-scratch example: testing each record in a loop 846 a special case: deleting some records 847 combination macro example: creating a report for each customer 849 using the advanced filter for unique records 851 using autofilter 853 selecting visible cells only 854 combination macro example: putting it all together 855 28 more tips and tricks for excel 2010 857 speeding up calculation by using multithreaded calculation 857 watching the results of a distant cell 858 opening the same files every day 859 comparing documents side by side with synchronous scrolling 860 calculating a formula in slow motion 861 inserting a symbol in a cell 862 edit an equation 863 adding a digital signature line to a workbook 864 protecting a worksheet 865 sharing a workbook 866 separating text based on a delimiter 867 translating text 868 29 tour of the best add-ins for excel 871 charting utilities from jon peltier 871 creating dashboards by using speedometer chart creator 872 add labels to xy charts 872 loading pdf data to excel by using able2extract 873customizing the ribbon using customizeribbon 873 accessing more functions by using morefunc.dll 873 general purpose utility suites 874 utilities for data analysis tasks 874 iv visual presentation of data 30 formatting worksheets 875 why format worksheets? 
875 using traditional formatting 877 changing numeric formats by using the home tab 879 changing numeric formats by using built-in formats in the format cells dialog 881 changing numeric formats using custom formats 885 aligning cells 889 changing font size 890 changing font typeface 890 applying bold, italic, and underline 891 using borders 892 coloring cells 894 adjusting column widths and row heights 896 using merge and center 897 rotating text 899 formatting with styles 901 understanding themes 903 choosing a new theme 903 creating a new theme 905 other formatting techniques 913 formatting individual characters 913 changing the default font 914 wrapping text in a cell 915 justifying text in a range 916 adding cell comments 917 copying formats 919 pasting formats 919 pasting conditional formats 921 using the format painter 921 copying formats to a new worksheet 92131 using data visualizations and conditional formatting 925 using data bars to create in-cell bar charts 926 creating data bars 928 customizing data bars 928 showing data bars for a subset of cells 930 using color scales to highlight extremes 931 customizing color scales 931 using icon sets to segregate data 932 setting up an icon set 933 moving numbers closer to icons 934 using the top/bottom rules 935 setting up conditional formatting rules 935 using the highlight cells rules 937 highlighting cells by using greater than and similar rules 937 comparing dates by using conditional formatting 939 identifying duplicate or unique values by using conditional formatting 940 using conditional formatting for text containing a value 941 tweaking rules with advanced formatting 942 using a formula for rules 944 finding cells within three days of today 945 finding cells containing data from the past 30 days 945 highlighting data from specific days of the week 946 highlighting an entire row 946 highlighting every other row without using a table 947 combining rules 947 clearing conditional formats 949 extending the 
reach of conditional formats 950 special considerations for pivot tables 950 32 using excel charts 953 understanding the components of a chart 954 setting up data for charting 955 inserting a chart by choosing a chart type 956 using the create chart dialog 957changing a chart’s type 960 moving or resizing a chart 960 choosing a chart layout to further customize the chart type 961 customizing a chart using the chart tools tabs 962 customizing a chart by using the design tab 963 changing chart settings using the layout tab 964 micromanaging using the format tab 966 charting tips and tricks 967 showing numbers of different scale on a chart 967 creating a chart with one keystroke 970 adding new data to a chart by pasting 970 adding new data to a chart by using a table 971 adding drop lines to a surface chart 972 predicting the future by using a trendline 973 creating stock charts 974 dealing with small pie slices 976 displaying three variables by using a bubble chart 978 changing the location of a chart 978 saving a favorite chart style as a template 978 using pivot charts 979 33 using sparklines 983 fitting a chart into the size of a cell with sparklines 983 understanding how excel maps data to sparklines 984 creating a group of sparklines 986 built-in choices for customizing sparklines 988 controlling axis values for sparklines 990 setting up win/loss sparklines 992 showing detail by enlarging the sparkline and adding labels 993 other sparkline options 995 34 using smartart, shapes, wordart, and text boxes 997 using smartart 998 elements common in most smartart 998 tour of the smartart categories 999 inserting smartart 1000 micromanaging smartart elements 1005 controlling smartart shapes from the text pane 1006adding images to smartart 1009 special considerations for organizational charts and hierarchical smartart 1010 using limited smartart 1013 deciphering the labeled hierarchy layouts 1014 using shapes to display cell contents 1015 working with shapes 1016 using 
the freeform shape to create a custom shape 1017 using wordart for interesting titles and headlines 1018 35 using pictures and clip art 1023 using pictures on worksheets 1023 formatting with picture styles 1024 resizing and cropping pictures 1024 reducing a picture’s file size 1026 adjusting a picture 1027 adding borders 1030 removing the background 1030 arranging pictures 1032 displaying the selection pane 1034 adding captions to images 1034 inserting screen clippings 1035 using clip art 1036 v sharing 36 printing 1039 printing from backstage view 1039 choosing a printer 1041 choosing what to print 1041 changing printer properties 1042 changing some of the page setup settings 1042 using print preview controls 1043 closing backstage view 1044 printing using quick print 1044 using page layout view 1044 using the improved headers and footers 1046 adding an automatic header 1046 adding a custom header 1047inserting a picture in a header 1048 using different headers and footers in the same document 1048 scaling headers and footers 1050 using the page setup and sheet options 1050 adjusting worksheet margins 1050 adjusting worksheet orientation 1052 setting worksheet paper size 1052 setting the print area 1052 adding print titles 1053 scaling options 1053 printing gridlines and headings 1054 working with page breaks 1055 manually adding page breaks 1055 manual versus automatic page breaks 1056 using page break preview to make changes 1056 removing manual page breaks 1057 37 excel web app and other ways to share workbooks 1059 sharing workbooks with others 1059 using the excel web application 1059 advantages of creating a client version of your workbook 1065 sending a workbook via email 1066 creating a pdf from a worksheet 1066 publishing to excel services on sharepoint 1069 interacting with other office applications 1070 pasting excel data to microsoft onenote 1071 using excel charts in powerpoint 1072 creating tables in excel and pasting to word 1073 pasting word data 
to an excel text box 1073 using excel data in a word mail merge 1074 building a pivot table from access queries 107638 saving time using the easy-xl program 1079 downloading and installing easy-xl 1079 easy-xl works best with tabular data 1080 doing away with vlookup 1080 using a fuzzy match 1082 text to columns on steroids 1086 sorting columns left to right 1088 summarizing data 1088 adding statistics to the report 1089 getting quick statistics 1090 transforming data instead of trim(), proper(), clean() 1091 adding text to cells 1091 filling in the annoying outline view 1093 there’s more 1093 deal with fiscal years 1095 record easy-xl commands into vba macros 1095 index 1097about the author bill jelen, excel mvp and the host of mrexcel.com, has been using spreadsheets since 1985, and he launched the mrexcel.com website in 1998. bill was a regular guest on call for help with leo laporte and has produced more than 1,200 episodes of his daily video podcast, learn excel from mrexcel. he is the author of 30 books about microsoft excel and writes the monthly excel column for strategic finance magazine. you will most frequently find bill tak-ing his show on the road, doing half-day power excel seminars wherever he can find a room full of accountants or excellers. before founding mrexcel.com, bill jelen spent 12 years in the trenches — working as a financial analyst for finance, marketing, accounting and operations departments of a 500 million public company. he lives near akron, ohio with his wife, mary ellen, and his sons, josh and zeke. acknowledgments excel 2007 and excel 2010 brought tremendous new gains to spreadsheets. david gainer at microsoft led the excel team through these two amazing versions. thanks to all the excel project managers who were happy to take the time to discuss the how or why behind a fea-ture. the powerpivot team of donald farmer, rob collie, and amir netz were tremendously helpful in getting me up to speed with powerpivot. 
thanks to dan bricklin and bob frankston for inventing the computer spreadsheet. thanks to mitch kapor for lotus 1-2-3. like everyone else who uses computers to make a living, i owe a debt of gratitude to these three pioneers.

i’ve learned that when writing a 1,000-page book, there is not much time for anything else. thanks to tracy syrstad, barb jelen, schar oswald, and scott pierson for keeping mrexcel running while i wrote. as always, thanks to the hundreds of people answering 30,000 excel questions a year at the mrexcel message board. thanks to wei jiang and jake hildebrand for their programming expertise. michael janscy provided great consulting help on the new statistical distribution functions.

at pearson, loretta yates is an awesome acquisitions editor. if you have ever written a book for any other publisher, you are missing out by not working with loretta yates. bob umlas is the smartest excel guy that i know and i am thrilled to have him as the technical editor for this book.

thanks to craig crossman and everyone at the computer america radio show. thanks to some early computing influences: carl bevington, khalil matta, gary kern, and hector guerrero. thanks to my friend and client jerry kohl. your ideas about how to make excel sing are fantastic.

finally, thanks to josh jelen, zeke jelen, and mary ellen jelen. in particular, it was mary ellen who realized that things had to change if i was going to get the books done on time. honey, you can put the whip away until excel 15.

we want to hear from you!

as the reader of this book, you are our most important critic and commentator. we value your opinion and want to know what we’re doing right, what we could do better, what areas you’d like to see us publish in, and any other words of wisdom you’re willing to pass our way.

as an associate publisher for que publishing, i welcome your comments.
you can email or write me directly to let me know what you did or didn’t like about this book, as well as what we can do to make our books better.

please note that i cannot help you with technical problems related to the topic of this book. we do have a user services group, however, where i will forward specific technical questions related to the book.

when you write, please be sure to include this book’s title and author as well as your name, email address, and phone number. i will carefully review your comments and share them with the author and editors who worked on the book.

email: feedback@quepublishing.com
mail: greg wiegand
associate publisher
que publishing
800 east 96th street
indianapolis, in 46240 usa

reader services

visit our website and register this book at www.quepublishing.com/register for convenient access to any updates, downloads, or errata that might be available for this book.

introduction

i was amazed when excel 2007 upped the row limit from 65,536 rows to 1.1 million rows. (excel 2010, when combined with the powerpivot add-in, can now sort, filter, and pivot 100 million rows!) combine the 1 million rows with new charting, data visualizations, intelligent business diagrams, the sumifs function, remove duplicates, page layout view, and table functionality, and excel 2007 should have been the best-selling version of all time.

but it departments everywhere delayed rolling out office 2007 because the familiar menus and toolbars had been replaced by the ribbon. suddenly, all the commands that people knew where to locate were shuffled around. this was fine for people new to excel, but it meant some time to get up the learning curve for the 500 million people already using excel.

it was fun to complain about why microsoft would remove the familiar menus and toolbars from excel 2007. we could mock them for replacing the all-important file menu with an icon so that no one could figure out how to print a document. but the excel versions march on, and the ribbon is here to stay.

when you get right down to it, there are really two things to learn about the ribbon: (1) pivot tables are on the insert tab instead of the data tab. (2) most of the stuff that you think would be on the insert tab is under the insert drop-down on the home tab. master those two facts and the rest of the stuff is in a logical place.

besides, microsoft replaced the mysterious wordless symbol on the file menu with the word “file” and dramatically improved that file menu into a full-screen backstage view (see chapter 1). you can now customize the ribbon so that you can move the pivottable drop-down back to the data tab where it belongs (for customizing the ribbon, see chapter 4). and microsoft added even more new features to excel 2010.

new in excel 2010

every new office version has a set of themes, and the new features are grouped around those themes. for excel 2010, the themes were to improve excel’s reputation in the scientific community and to make excel the premier tool for business intelligence.

as a former data analyst, i love the new features for analyzing data. i also remember that while i loved wrangling large data sets into meaningful analyses, i never wanted to spend the time to make those meaningful analyses look “pretty.” excel 2010 offers new graphics improvements that make it easy to add some visual interest to your numbers.

improvements in business intelligence

• powerpivot add-in: you can now sort, filter, and pivot data sets that are beyond 1 million rows. the powerpivot tool allows you to mash up 100 million rows of data from excel, text files, rss feeds, sql server, oracle, and more. a new dax expression language offers time intelligence functions that enable you to compare fiscal year-to-date sales with the parallel period from a year ago. see chapter 25 for more about powerpivot.
• pivot table slicers: filtering data in pivot tables becomes visual with graphical filters known as slicers. in previous versions of excel, the filter drop-downs offered the capability to choose multiple items, but no one reading the report could tell what was included or not included. these new graphical filters show what is in the summary report and invite people to do ad-hoc analyses by choosing new options from the slicers. see chapter 24 for details on slicers.
• asymmetric pivot tables: do you need to show last year’s actuals versus this year’s budget? that was hard to do in previous versions of excel, but the named sets command for pivot tables created from olap data makes it easy in excel 2010. don’t have olap? run your excel data through powerpivot to enable named sets. see chapter 24.
• percentage of parent item in pivot tables: new calculations in the show values as drop-down allow for calculations such as percentage of parent row, rank, and more.
• aggregate function: whereas excel 2007 added the plural sumifs function, the killer function in excel 2010 is aggregate. this function is like the subtotal function on steroids. you have 19 calculation options instead of the 11 in subtotal, plus the capability to ignore hidden rows, filtered rows, or error cells. read about aggregate in chapter 11.

presentation improvements

• tiny charting with sparklines: edward tufte published his first descriptions of the intense, tiny, word-sized charts in a book in 2006. microsoft incorporated three types of sparklines in excel 2010. chapter 33 describes how to leverage sparklines. sparklines join the excel 2007 data visualizations. in that area, the old data bars feature gets a makeover with new options. see chapter 31.
• new smartart layouts: fifty new business diagram layouts bring the total to more than 130 types of business diagrams that you can create with the smartart tools. learn about smartart in chapter 34.
• better picture tools and background removal: if you need to dress up a report with a photograph, improved picture tools help to correct the picture. a fairly cool background removal tool will have you creating odd-shaped picture elements. see chapter 35. also in that chapter, learn about a new screen clipping tool that allows you to paste a picture of a section of any screenshot into your worksheet.

excel interface improvements

• paste options flyout: if you study the undo command, can you guess which operation immediately precedes undo most often? it is paste or one of the paste variants. the new paste options flyout will help you paste data with or without borders, formulas, links, column widths, and so on. ctrl+c, ctrl+v, ctrl, v is becoming my favorite key sequence. see chapter 3.
• file menu becomes backstage view: most ribbon tabs contain commands that you use while working in your document. the file menu contains commands that you apply to the entire document when you are done with your document. the team at microsoft figured that these commands don’t require you to see the document, so all the screen real estate is taken up with a new full-screen file menu. read about backstage view in chapter 1.
• ribbon customizations: this should not even be noted in the book because it has been possible to customize ribbons and toolbars for more than a decade. however, excel 2007 offered no way in the excel interface to customize the ribbon. chapter 4 shows you how to customize the ribbon in excel 2010.

improvements for the science community

• improved function accuracy: various academic papers had attacked the accuracy of some of excel’s statistical, financial, and math functions. microsoft hired two outside math consultancy firms to rewrite the algorithms for a number of functions and then hired a third firm to validate which algorithm was the best for each function. see chapter 10 for a discussion of these improvements.
• consistency in function names: the statistical function set had been a confusing jumble of functions. for every distribution, you always had to check help to see whether the particular function was a cumulative function, left-tailed, right-tailed, and so on. you will begin to see many new functions that include a dot in their name. those dots lead to an easier-to-understand description of the function. for statistical distribution functions, a name like distribution.dist will be the left-tailed cumulative distribution when the cumulative parameter is true and the probability density function when the cumulative parameter is false. if you need a right-tailed distribution, look for distribution.dist.rt. for two-tailed, look for distribution.dist.2t. the inverse functions will be distribution.inv. functions based on a sample will be function.s and functions based on a population will be function.p. excel still supports the old names, but new formulas can be created with var.s and var.p instead of the old var and varp. see chapters 10 and 14 for a description of these new function names.
• equation editor: you can now insert a variety of equations into a text box in excel. the equation tools support radicals, integrals, matrices, brackets, functions, and symbols. you can convert the equation into a linear view for editing and then press a button to turn it back into a 2d equation.

upgrading from excel 2003 or earlier

a large percentage of the people upgrading to excel 2010 are people who skipped over excel 2007. this book assumes that things that were new in excel 2007 are still new to over half of the readers. if you are upgrading from excel 2003, watch for these new features:

• massive grid: 1.1 million rows and 16,000 columns. that’s 17 billion cells, just on sheet 1. read about the big grid in chapter 7.
• data visualizations: icon sets, color scales, and in-cell bar charts are found on the conditional formatting drop-down of the home tab. see chapter 31 for more on these tools.
• better looking charts: the excel 2007 charting engine is new. the charts look better. the charting interface has improvements, but it is harder to do some things than in excel 2003. read about charts in chapter 32.
• remove duplicates: you can remove duplicates with a couple of clicks using the new command on the data tab.
• page layout view: edit headers and footers in place using the new page layout view.

how this book is organized

the book is organized into the following parts:

• part i, “mastering the new user interface”: this first part of the book shows you the ribbon, backstage view, mini toolbar, quick access toolbar, and more.
• part ii, “calculating with excel”: this part covers what excel does best, from formulas to functions to linking.
• part iii, “business intelligence”: sorting, filtering, subtotals, pivot tables. these are the tools of the excel data analyst. learn about these tools and the new powerpivot add-in in part iii. the chapter on vba macros is also in this part.
• part iv, “visual presentation”: this part covers charting, smartart, data visualizations, and picture tools. after you get done analyzing the data, a few features from this part will make your reports look good.
• part v, “sharing information”: this part discusses printing and sharing your excel workbooks by creating pdfs or publishing to the web.

conventions used in this book

the special conventions used throughout this book are designed to help you get the most from the book as well as from excel 2010.

text conventions

different typefaces are used to convey various things throughout the book. they include those shown in table i.1.

table i.1 typeface conventions
typeface    description
monospace   screen messages appear in monospace.
italic      new terminology appears in italic.
bold        references to text you should type appear in bold.
ribbon names, dialog box names, and dialog box elements are capitalized in this book (for example, add formatting rule dialog, home tab).

in this book, key combinations are represented with a plus sign. if the action you need to take is to press the ctrl key and the t key simultaneously, the text tells you to press ctrl+t.

there were not many changes from excel 97 to excel 2000 to excel 2002 to excel 2003. most people upgrading to excel 2010 will be coming from one of these versions of excel. i will collectively refer to these versions as “legacy versions of excel.”

special elements

throughout this book, you’ll find tips, notes, cautions, cross-references, case studies, excel in practice boxes, sidebars, and troubleshooting tip boxes. these elements provide a variety of information, ranging from warnings you shouldn’t miss to ancillary information that will enrich your excel experience but isn’t required reading.

cross references: see chapter 99 for more information.

note: notes contain extra information or alternative techniques for performing tasks.

tip: tips point out special features, quirks, or software tricks that will help you increase your productivity with excel 2010.

the author has more than 1,200 excel podcast episodes available. certain topics in the book will refer you to a video demo in the excel in depth channel at youtube.

caution: cautions call out potential gotchas.

case study: other elements

sections such as case study, excel in practice, and troubleshooting tips are set off in boxes such as this one:
• case studies walk you through the steps to complete a task.
• excel in practice boxes walk through real-life problems in excel.
• troubleshooting tips boxes walk through steps to avoid certain problems or how to react when certain problems occur.

sidebars

historical glimpses and other information that is not critical to your understanding appear as sidebars.
i imagine that if the cliff clavin character from cheers knew a lot about excel, these would be the kinds of things he would write.

1 the file menu becomes the backstage view

open the file menu in excel 2010 and you might be shocked to see that it takes up 100% of your screen real estate. this new panel is called the backstage view and represents a significant development effort in office 2010.

understanding “in” versus “out” commands

microsoft added the ribbon to excel 2007 to make working in your spreadsheet easier. microsoft describes the typical wysiwyg interaction model for a command such as bolding a cell as follows:
• scan the worksheet.
• see something you want to change.
• find the command.
• use the command.
• see the results on the page.

all the commands on the home, insert, page layout, formulas, data, review, and view tabs of the ribbon are commands that you use when you work in your document. microsoft calls these commands the “in” commands.

by contrast, the commands on the file menu are not for working in the document. you use these commands after you finish the document and you are ready to do something with it. perhaps you want to print, or email, or save the document. all the out commands share the characteristic that they don’t act on a specific point in the worksheet and their results don’t appear in the spreadsheet. for lack of a better term, microsoft calls these commands the “out” commands.

after developing the ribbon for excel 2007, microsoft felt it had simplified the process of finding the in commands. but the wysiwyg model falls apart for the out commands:
• you can’t scan the worksheet for something you want to change. the status of the out features doesn’t appear in the worksheet.
• the out commands suffered from low discoverability. whereas people using excel 2007 figured that an editing command had to exist somewhere, people were not searching for the out commands. for example, some items, such as the document inspector, were hidden several layers deep and rarely found.

to address these issues, microsoft added the backstage view to provide a consistent home for the out commands.

using backstage view

to open the backstage view, click the file menu. the backstage view fills the screen, as shown in figure 1.1. backstage is split into three sections: the narrow left navigation panel and two wider sections that provide information.

pressing the esc key to close backstage view

to get out of backstage and return to your worksheet, you can either press the esc key or click another ribbon tab.

using the four quick commands in the left navigation

four commands are so important that they get their own real estate in the top of the left navigation pane: save, save as, open, and close. the save and close commands act as soon as you select them. the save as and open commands bring up another dialog box where you can save or open a document. not much has changed in the save as or open dialog boxes.

at the bottom of the left navigation are two additional quick commands: options and exit. the options command leads to the excel options dialog box. see chapter 6, “the excel options dialog,” for details on the excel options.

note: first, it is great to note that once again a file menu exists in excel 2010. for the past 3 years, people using excel 2007 had a version of office in which the file menu had been replaced by a branding decoration. this led to huge frustrations as people searched the other ribbon tabs trying to find items that used to be on the file menu. although this frustration gave my power excel seminar audiences a shared common experience and a reason to laugh at microsoft’s folly, it is seriously good to see that microsoft abandoned this idea and came back with a file menu. as late as july 2009, the technical preview of office 2010 was sticking with the branding element instead of the file menu.
someone in the usability lab had the genius to run a test in which they tried putting the letters “f-i-l-e” on the tab instead of the office icon. microsoft reports that discoverability of the backstage view increased dramatically, so it abandoned the branding decoration.

the remaining entries in the left navigation each lead to large galleries of information and commands: info, recent, new, print, save & send, and help. each of these will be described in the sections that follow.

opening recent files

if you have no workbook open in excel and you go to the file menu, you start in the recent gallery (see figure 1.3). in the central panel, you have a list of recent workbooks. as in excel 2007, this list can show up to 50 documents instead of the 9 that were available in excel 2003. the central panel includes a vertical scrollbar so that you can scroll to more documents than are visible on your screen.

tip: because the commands in the backstage view are all out commands, they do not operate on a specific point in the spreadsheet. thus, you do not need to actually see your spreadsheet while you use the commands. microsoft takes the liberty of having the backstage view fill the entire width of the screen. the screen is divided into three asymmetric parts: a narrow left navigation pane, a medium-width central command gallery, and a wide information pane on the right side.

figure 1.1 backstage view.
figure 1.2 none of the visible clickable “x” elements close backstage.
figure 1.3 the recent pane.

the right side of the screen includes a great new feature called recent places. this list shows all the folders from where you have recently opened documents.
if i am currently working on next year’s budget, it is likely that i will have to open other documents in the \futurebudget\ folder. having the list of folders on the right side is a feature that you will find yourself using frequently. when you click a folder on the right side of the screen, you are taken to the standard excel open dialog, but it will already be showing the selected folder.

in figure 1.3, notice the two columns of gray pushpins. if you click one of those gray pushpins, it turns into a colored pushpin that appears to be pushed into the screen. this pins an item to either the recent workbooks or the recent places list. when an item is pinned to the list, it stays at the top of the list, no matter how many other files you might open. this is great for the files that you use for month-end reporting. even if you happen to open 153 other files during the course of the month, when the next month-end period rolls around, you can be sure that your month-end report file is at the top of the recent list.

one-click access to recent files

during the beta for office 2010, many people complained that the recent file list is now two clicks away instead of one click away. when any excel workbook is open, clicking the file menu takes you to the info panel instead of the recent gallery. people complained that they had twice as many clicks to select file, recent before they could see their recently used files.

there are two solutions for this. first, you can easily add the recent file list to the quick access toolbar. at the right side of the quick access toolbar is a drop-down arrow. open this drop-down to reveal 12 common commands you can add to the quick access toolbar (see figure 1.4). if you choose open recent file from this list, you have an open recent file icon on your quick access toolbar. click this icon to go directly to the recently used file list.
For people who like the keyboard accelerators, Alt+4 invokes the fourth icon on the Quick Access toolbar, so there is a one-keystroke method for accessing the recently used file list, provided it is one of the first nine icons on your Quick Access toolbar.

There is a second method for quickly accessing recent workbooks. At the bottom of the Recent Workbooks list is a check box and a spin button. On a 1440x900 monitor, you have room for about 10 more items along the left navigation panel. This check box allows you to fill that space with recent files. Select the check box and choose any number of files. In Figure 1.5, 10 new files are shown in the left navigation. If you have items pinned to the Recent Workbooks list, they are shown immediately after the quick commands in the left navigation. The rest of the space is filled with recent documents.

Tip: Initially, the Recent Workbooks list includes up to 25 documents. If you visit File, Options, Advanced, Display, you find a setting for Show This Number of Recent Documents. Dial this up to 50 documents.

Tip: If you were a fan of the keyboard shortcut Alt+F, 1 to open the most recent file, you will appreciate that using this check box allows Alt+F, 1 to continue to work as expected.

Figure 1.4 Add Open Recent Workbooks to the QAT. (Callouts: Open Recent File icon, QAT drop-down.)

Figure 1.5 Use Alt+F, 1 to access the first item pinned to your Recent Workbooks list.

To see a demo about the Recent Files pane, search for “Excel In Depth 1” at YouTube.

Recovering Unsaved Workbooks

As in legacy versions of Excel, the AutoSave feature can create copies of your workbook every n minutes. (“Legacy” in this book refers to Excel 2003 and earlier versions.) If an AutoSave version of your workbook exists, you can now access that file using the Recover Unsaved Workbooks icon at the bottom of the Recent Places list.
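To make the "every n minutes" idea concrete, here is a minimal sketch of when a recoverable copy would exist. It assumes a simplified model in which a snapshot is taken at fixed intervals counted from the moment the workbook is opened; Excel's actual timing rules may differ, and the function name `last_autosave` and its parameters are hypothetical.

```python
import math
from typing import Optional


def last_autosave(opened_min: float, closed_min: float,
                  interval_min: float) -> Optional[float]:
    """Return the time (in minutes) of the last snapshot taken before closing.

    Simplified assumption: one snapshot every `interval_min` minutes, counted
    from `opened_min`. Returns None if the workbook was closed before the
    first snapshot, in which case there would be nothing to recover.
    """
    if interval_min <= 0:
        raise ValueError("interval_min must be positive")
    ticks = math.floor((closed_min - opened_min) / interval_min)
    if ticks < 1:
        return None
    return opened_min + ticks * interval_min


# Opened at minute 0, closed without saving at minute 47, 10-minute interval:
print(last_autosave(0, 47, 10))  # 40, so up to 7 minutes of work is lost
# Closed after only 8 minutes: no snapshot exists yet.
print(last_autosave(0, 8, 10))   # None
```

Under this model, a shorter interval means less work can be lost, but a workbook closed before the first interval elapses leaves nothing to recover.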
do you ever get to the end of your workday, use the x to close excel, and then are greeted with a barrage of do you want to save questions, as shown in figure 1.6? i frequently forget that the nth workbook that i have open is not saved. i will think that i had opened these workbooks to get infor-mation, that i had not made any significant changes, and will either start clicking don’t save repeat-edly, or will hold down shift and click don’t save, which is equivalent to clicking the nonexistent don’t save to all selection. figure 1.6 if you accidentally click don’t save after excel has autosaved, you might be able to recover the document. as i see that important file get closed, i realize that i just lost all my changes to that file and cringe. this is a common problem that happens to everyone sooner or later. provided that the file was open long enough to experience an autosave, you may be able to get the file back. go to recover unsaved workbooks and find the date and time of the last autosave. it might be within 5 minutes of the last time you edited a cell in that document. when you find the file and open it, the information bar reports that this is a recovered unsaved file (see figure 1.7). click save as to give the file a name. figure 1.7 excel recovers the file. you need to save as to make the recovery permanent.the fi le menu becomes the backstage view 14 1 part clearing the recent workbooks list if you need to clear out the recent workbooks list, you should visit file, options, advanced, display. set the show this number of recent documents list to zero. this is unlike the behavior in excel 2003. in excel 2003, to clear out the ninth item from the list, you had to reset only the number of files back to 8 and excel would forget about number 9. in excel 2010, if you switch from 50 files to 1 file, then back to 50 files, all 50 files will come back. the only way to clear the recent workbooks list is to set the value back to zero. 
You can then reset it to 50 and Excel will start collecting history again.

Getting Information About the Current Workbook

When a workbook is open and you go to the File menu, you start in the Info gallery for that workbook. As shown in Figure 1.8, the Info pane lists all sorts of information about the current workbook. In legacy versions of Excel, much of this information might have been tough to determine without using VBA or even going out to Windows Explorer.

Some of the information is now at your fingertips in the Info gallery:
• The workbook path is shown at the top of the gallery. You can select the text in the path, press Ctrl+C to copy it, and then paste the path wherever you need it.
• You can see the file size.
• You can see when the document was last modified and who modified it.
• If any special states exist, these are reported at the top of the middle pane. Special states might include the following:
  • Macros not enabled.
  • Links not updated.
  • Checked out from SharePoint.
• You can see if the file has been autosaved and recover those autosaved versions.
• You can mark the document as final, which causes others opening the file to initially get a read-only version of the file.
• You can edit links to other documents.
• A thumbnail of the current window of the document appears in the top right in case you forgot which document you are editing.
• You can add tags or categories to the file.

Caution: This raises some privacy concerns. If you are frequently working on documents that you don’t want others to see, they might be able to recover them from the last autosaved version. I don’t recommend it, but if you need to, you can turn off this new feature.
to do this, go to file, options, save, and clear the keep the last autosaved version if i close without saving check box.15 get t ing informat ion about the cur rent workbook 1 chapter • using the check for issues drop-down, you can run a compatibility checker to see if the work-book is compatible with legacy versions of excel. you can run an accessibility checker to see if any parts of the document will be difficult for people with disabilities. you can run a document inspector to see if any private information is hidden in the file. figure 1.8 the info gallery includes all of the prop-erties of the file. correcting special states such as disabled macros and links open a workbook that contains macros and external links. a message appears in the information bar that this content has not been enabled. if you visit the backstage view, the fact that the content has not been enabled appears at the top of the center pane (see figure 1.9). figure 1.9 you see a summary that certain content is not enabled.the fi le menu becomes the backstage view 16 1 part open the enable content drop-down and you are presented with an easy way to make the document a trusted document (see figure 1.10). if you choose the top item in the drop-down, all the content will be enabled and you will no longer be asked about enabling this content in this file on this computer. note these rules apply only to work-books stored on a local hard drive. if the file is stored on a network, then excel is going to ask you about enabling macros every time that you open the file. figure 1.10 it is now easier to trust a document using this drop-down. excel’s automatic trusting of a document say that you have an .xlsm workbook stored on a local hard drive. this workbook has some macros that you recorded. if you open this workbook and select to enable macros, then excel assumes that you are ok with the macros and automatically enables those macros the next time. this is a good feature. 
excel asks you only once about the file. this makes it far more likely that you will actually leave the excel macro security setting so that all macros are disabled with notification. if you send the workbook to someone else and they save it on their computer, they will be presented with the enable macros question again, but they would be asked about it only once. opening a file in the protected view sandbox another new feature in excel 2010 is the protected view. if you download a workbook from the internet or another unsafe location, you can open the workbook, but it will default to something called protected view. in this view, you cannot edit the workbook. macros will not run. links will not update. the theory is that you can actually look around the document, look for macros, look for links, and so on before you decide to trust the document. after making sure that the document is safe, you can click the enable editing button in the info bar (see figure 1.11). marking a workbook as final to prevent editing open the protect workbook icon in the info gallery to access a setting called mark as final (see figure 1.12). this marks the workbook as read-only. it will prevent someone else from making changes to your final workbook.17 get t ing informat ion about the cur rent workbook 1 chapter of course, if the other person visits the info gallery, that person can reenable editing. this feature is simply designed to warn the other people that you’ve marked it as final and no further changes should happen. if you can convince everyone in your workgroup to sign up for a windows live id, you can use the restrict permission by people setting. this layer of security allows you to define who can read, edit, and/or print the document. finding hidden content using the document inspector the document inspector can find a lot of hidden content, but it is not perfect. still, finding 95% of the types of hidden content can protect you a lot of the time. 
to run the document inspector, select file, info, check for issues, inspect document, and click ok. the results of the document figure 1.11 workbooks from the internet open in a protected sandbox. figure 1.12 mark a document as read only. caution the document inspector is not foolproof. do you frequently hide settings by changing the font color to white or by using the ;;; custom number format? this won’t be found by the document inspector. the document inspector also won’t note that you had scrolled over outside the print area and jotted your after-work grocery list in column x.the fi le menu becomes the backstage view 18 1 part inspector shown in figure 1.13 shows that the document has personal information stored in the file properties (author’s name) and a hidden worksheet. figure 1.13 look for hidden personal items in the workbook. creating a new workbook from a template if your manager asks you to create a new calendar, expense report form, invoice, and so on, you should visit the new category in backstage view. there are hundreds of templates already prebuilt on office online. although the gallery initially offers budgets, calendars, and seven other categories, as shown in figure 1.14, you can browse through the more categories folder for hundreds of other types of prebuilt worksheets. these are free for your use as a registered owner of microsoft excel. printing and print preview print preview has been moved to the print category in backstage view. backstage view consolidates settings that used to be in page setup, in the print dialog, in print preview, and in the printer settings dialog. (see figure 1.15) tip if you just want a new blank workbook, use ctrln and avoid the hassle of file, new, blank workbook, create.19 pr int ing and pr int preview 1 chapter figure 1.14 many pre-built tem-plates are available for your use from office online. 
figure 1.15 print preview and printer settings are combined in backstage view.the fi le menu becomes the backstage view 20 1 part microsoft created a brand new type of gallery for the print section of backstage view. rather than have a drop-down for orientation, this new gallery shows you the current orientation setting. if you need to change the orientation, you can open the drop-down to access more settings. but if the set-ting is correct, you don’t have to access the drop-down at all. note that ctrlp now brings you to the printing gallery in backstage view. if you want to do quickprint, you should add the quickprint icon to the quick access toolbar. printing is covered in detail in chapter 36, “printing.” sharing your workbook using save & send the save & send gallery in backstage view offers several categories of changes. each item in the central pane leads to more choices in the right pane. figure 1.16 shows send using e-mail. the right pane offers choices to attach the workbook to an email, send as pdf, send as xps, or send as internet fax. the save to web allows you to save your workbook to your windows live account. save to sharepoint allows you to save your document to your company’s sharepoint library. figure 1.16 attach your work-book to an email.21 get t ing updates and help 1 chapter change file type offers to let you convert your file to an excel 97-2003 workbook, an opendocument spreadsheet, a binary workbook, or other file types (see figure 1.17). finally, create pdf/xps document allows you to save your workbook as an adobe pdf file or the competing xps format from microsoft. figure 1.17 convert your file to one of several file types. getting updates and help you can access excel help at any time by pressing f1 or by clicking the blue question mark in the top-right corner of the excel screen. if you go to the help gallery in backstage view, you have access to help, getting started, contact us, options, and check for updates. 
help is the same as clicking the blue question mark icon. options is the same as using options in the left navigation pane. the right side of the screen tells you if your version of excel has been activated or if you are in a trial. you can access your product key and product version using this panel (see figure 1.18). there are many benefits to the new backstage view with only a few drawbacks. those drawbacks, such as requiring an extra click to get to the recent files list, usually have workarounds. you should find that the backstage view will receive a better welcome than the initial backlash against the ribbon.the fi le menu becomes the backstage view 22 1 part figure 1.18 get updates, find your current ver-sion, or activate office on the help panel.2 the ribbon interface and quick access toolbar if you have upgraded directly to excel 2010 from excel 2003 or earlier, you are going through the shock of discovering that the familiar file, edit, view, insert, format, tools, data, window, and help menus, along with the standard and formatting toolbars, are gone from excel 2010. in their place, microsoft introduces the ribbon. using the ribbon the ribbon is composed of seven per-manent tabs labeled home, insert, page layout, formulas, data, review, and view. each tab is broken into rectangular groups of related commands. the group shown in figure 2.1 is the clipboard group. the mantra of the ribbon is to use pictures and words. many people have seen the little whisk broom icon in previous ver-sions of office but never knew what it did. if you hover over the icon in excel 2003, the tooltip tells you that it was the format painter, which at least gave you a place to start looking in the help file if you were really curious. in excel 2010, the same icon has the words “format painter” next to the icon. when you hover, the tooltip offers paragraphs explaining what the tool does. 
the tooltip offers a little-known trick: you can double-click the format painter to copy the formatting to many places. the tooltip offers a link to the help topic about the format painter. all these steps are designed to help more people find and make use of the format painter tool. in figure 2.1, the cut icon is a pure command. you click the icon and excel cuts the selection onto the clipboard. note figure 2.1 shows a bit of detail of the left side of the home tab of the ribbon.the ribbon inter face and qui ck access toolbar 24 1 part in contrast, the paste and copy icons are a new type of element in that each represents a hybrid command. figure 2.2 is shot with the mouse pointer hovering over the paste icon. you see that this icon is actually two icons. the top half of the icon is the actual paste command. the bottom half of the icon is a drop-down menu offering other types of paste commands (see figure 2.3). figure 2.1 detail of the clipboard group of the home tab of the ribbon. figure 2.3 click the bottom of the paste icon to access more paste options. figure 2.2 the paste icon is actually two icons; a paste command and a paste drop-down. using dialog launchers and the 80/20 rule the ribbon is designed to make it easier to discover features that should be used by most people using excel. it is not designed to hold every command available in microsoft excel. in many cases, even the middle-of-the-road exceller needs to go beyond the commands on the ribbon.25 us ing the ribbon 2 chapter a special symbol in the lower-right corner of many ribbon groups takes you directly to the dialog box with many more choices than those offered in the ribbon. figure 2.4 shows detail of the number group of the home tab. in the lower-right corner is a tiny symbol. the symbol is the top-left corner of a box, with an arrow pointing downward to the right. this symbol is called a dialog launcher. dialog launcher figure 2.4 the dialog launcher takes you to additional options. 
when you click the dialog launcher, you go to a dialog box that often offers many more choices than those available in the ribbon. in figure 2.5, you see the number tab of the format cells dialog. dialog launchers are not the only way to access dialog boxes. flyout menus and galleries offer their own way to reach dialog boxes. figure 2.5 click the dialog launcher to get to the full dialog box with all the choices. using flyout menus and galleries the ribbon introduces two new kinds of controls; visual flyout menus and galleries. figure 2.6 shows the visual flyout menu that appears when you open the conditional formatting drop-down on the home tab. you can see that the initial menu continues to offer pictures andthe ribbon inter face and qui ck access toolbar 26 1 part words, with words next to the icons for data bars, color scales, and icon sets. when you hover over a selection, a flyout menu appears with more visual choices (see figure 2.6). figure 2.6 the flyout menus continue the theme of offering pictures and words. note the more rules option at the bottom of the data bars menu. the more command occurs at the bottom of many flyout menus. when you see more, you will know that the menu is offering you only a subset of options. click more to access all the options. another new element in the ribbon is the gallery control. galleries are used when there are dozens of options from which to choose. the gallery shows you a visual thumbnail of each choice. in figure 2.7, you see the first row of choices in the table styles gallery. notice the three arrows at the right end of the gallery. figure 2.7 a gallery control starts by showing one row of thumbnails but offers three arrow con-trols at the right end. you can use the up-arrow and down-arrow icons to browse through the gallery one row at a time. or you can press the third arrow to open the entire gallery and see all the choices, as shown in figure 2.8. 
again, at the bottom of figure 2.8, you have additional choices for new table style. this leads to a dialog box with all the options around table styles.27 the ribbon is constant ly changing 2 chapter the ribbon is constantly changing although you start out with ribbon tabs for file, home, insert, page layout, formulas, data, review, and view, you constantly see other tabs appearing and disappearing. further, as you resize your excel window, the icons change and resize. harnessing contextual ribbon tabs excel 2010 offers a whole series of commands for dealing with photographs that you insert into your worksheet. however, 90% of the people never bother to dress up their worksheets with clip art or pictures, so there really is no reason to show all the commands for working with photographs in the ribbon. there is one persistent command in the ribbon that deals with pictures. you can use the picture command on the insert tab of the ribbon to insert a picture. after you use that command to insert a picture, and provided that the picture is selected, a new tab called picture tools format appears. this tab offers a gallery with all sorts of tools for changing the appearance of the picture. see figure 2.9 for some detail from the picture tools format tab. figure 2.8 if you open the gallery control, you can scroll through more choices.the ribbon inter face and qui ck access toolbar 28 1 part here is the frustrating thing. as soon as you click outside of the picture, the picture is no longer selected and the picture tools format tab disappears. if you need to format an object and you cannot find the icons for formatting the object, try clicking the object to see if the contextual tabs appear. these are the context-sensitive tabs: • add-ins—this tab contains any menu items added through vba macros or add-ins. this tab appears when an add-in is loaded. • background removal—this tab is new in excel 2010 and is used to remove the background from a photograph. 
To access the tab, use the Background Removal icon on the Picture Tools tab.
• Chart Tools: The Chart Tools tab includes three tabs: Design, Layout, and Format. The Design tab provides features to change an entire chart. By the time you get to the Format tab, you are micromanaging small aspects of a chart.
• Drawing Tools: This includes a Format tab for working with shapes. To access the tab, select Insert, Shapes.
• Equation Tools: The Design tab appears when you use the Equation Editor.
• Header & Footer: This tab appears when you edit the header or footer for a page in Page Layout view. To access the tab, click View, Page Layout View and then click in either the header or footer zone. Note that there is a shortcut to Page Layout view, located to the left of the zoom slider, in the lower-right corner of the screen.
• Ink Tools: This tab contains pen commands for tablet PCs.
• Picture Tools: This tab is available after you insert clip art or an image and select the illustration.
• Pivot Chart Tools: After you insert a pivot chart, four new tabs are available: Design, Layout, Format, and Analyze. The first three of these tabs are similar to the Chart Tools tabs. The fourth contains the pivot table features.
• PivotTable Tools: This includes two tabs: Options and Design. The major settings appear on the Options tab. Formatting options appear on the Design tab.
• Print Preview Tools: This small tab appears as the only tab when you are in Print Preview mode. Because Print Preview moved to the Backstage view, this tab will be very elusive.
• Slicer Tools: The Options tab appears whenever one of the new Excel 2010 visual filters for pivot tables is selected.

Figure 2.9 When a picture is selected, the Picture Tools Format tab is available.

• SmartArt Tools: In Excel 2010, the former Business Diagrams has been renamed SmartArt.
when you are working with organization charts or other smartart diagrams, two new tabs are available: design and format. • sparkline tools—the design tab appears whenever the current selection is in a sparkline. sparklines are tiny, word-sized charts that debuted in excel 2010. • table tools—the design tab allows for the formatting of a database in excel after it has been converted to a table. in excel 2010, tables replace excel 2003 lists. all these contextual tabs come and go as you select and clear certain items in excel. resizing excel changes the ribbon you need a monitor with a 1440-pixel-wide resolution to see the entirety of the ribbon. anytime that you view the ribbon at a smaller size, excel starts intelligently collapsing icons on the ribbon. figure 2.10 shows the styles and the cells group of the home tab at a 1280 resolution. at this resolution, the cell styles gallery visible at 1440 resolution has already collapsed into a large drop-down icon. figure 2.10 at 1280 resolution, the styles and cells groups appear as six large drop-down icons. at a smaller resolution, the size of the six icons gets smaller, but you still have words, as shown in figure 2.11. figure 2.11 at smaller resolutions, the icons shrink. eventually, those groups shrink into a single drop-down for the entire group, as shown in figure 2.12. figure 2.12 eventually, the entire group collapses to a single drop-down.the ribbon inter face and qui ck access toolbar 30 1 part if you continue to shrink the width of the excel window, the ribbon disappears altogether, as shown in figure 2.13. figure 2.13 below 300 pixels of width, microsoft figures that you are not working in the application anymore and hides the ribbon entirely. to see the ribbon changing when the window is resized, search for excel in depth 2 at youtube. solving common ribbon problems here are some common complaints about the ribbon and some advice on how to best deal with these issues. 
You Cannot Find a Particular Command on the Ribbon

Here are some tips for working with the ribbon:
• Start on the Home tab. All the commands on the old Formatting toolbar are here, as well as most of the old Insert and Format menus.
• Pivot tables moved from the old Data menu to the Insert tab. They don’t belong there. They belong on the Data tab with all the other commands from the old Data menu.
• The four most popular commands on the Excel 2003 Insert menu are no longer on the Insert tab. Insert Cells, Insert Rows, Insert Columns, and Insert Worksheet are all now found under the Insert drop-down in the Cells group of the Home tab.
• Commands from the old Tools menu are generally found on the Review tab.
• Commands from the old Window menu are generally found on the View tab.
• Macro commands are on a Developer tab of the ribbon that is hidden by default. See Chapter 4, “Customizing the Ribbon,” for how to bring the Developer tab back.

You Still Cannot Find the Command on the Ribbon

Following are several strategies for finding the command:
• If you remember the old Excel 2003 keyboard accelerators, try typing those. For example, Alt+E, I, J still invokes Edit, Fill, Justify, even though there is no longer an Edit menu.
• Right-click the ribbon and select Customize. In the Customize dialog, select All Commands from the left drop-down. Scroll through the list of all commands until you find the command. Hover over the command. A tooltip appears, showing you where you can find the command. In Figure 2.14, the Justify command is located on the Home tab, in the Editing group, under the Fill drop-down.

Figure 2.14 This tooltip in the Customize dialog shows you where to find a command. (Callouts: ribbon tab, group, command (drop-down).)

• Download my full-color tip card that maps every old command in the Excel 2003 menu and toolbars to a ribbon tab. You can access the tip card at http://www.mrexcel.com/excel2007tip-card.html.
• microsoft has an interactive ribbon guide that can help you locate an excel 2003 menu command on the ribbon. type interactive ribbon guide in any search engine to find the latest incarnation of the ribbon guide. the ribbon takes up too many rows i don’t want to argue with you, but the new ribbon only appears to take up a lot of space. it does not take up more space than the excel 2003 menu, formatting, and standard toolbars (provided you showed those toolbars on two rows). however, you can minimize the ribbon using one of these techniques: click the caret icon on the right side of the ribbon, just to the left of help (see figure 2.15). right-click the ribbon and select minimize the ribbon. note sometimes, a command truly is not in the ribbon. if you hover over a command in the customize dialog and it indicates that it is a command that is not in the ribbon, you will have to use customize to add this command to the quick access toolbar or to the ribbon.the ribbon inter face and qui ck access toolbar 32 1 part when the ribbon is minimized, you see only the words home, insert, page layout, and so on (see figure 2.16). after you click a tab, the ribbon temporarily reappears. when you finish selecting a command, the ribbon goes back to the minimized size. figure 2.15 caret icon to minimize the ribbon. figure 2.16 when the ribbon is minimized, you see only the tab names. you do not like where something is located on the ribbon i am with you on this one. pivot tables belong on the data tab, not on the insert tab. further, if you could take the left half of the home tab and combine it with the right half of the data tab, most people would hardly ever have to leave that one tab. the great news in excel 2010 is that you can now customize the ribbon. see chapter 4 to learn more about customizing the ribbon. you cannot see all your favorite commands at once a problem with the ribbon is that only one-seventh of the commands are visible at any given time. 
You will find yourself moving from one tab to another. The alternative is to use the Quick Access toolbar.

Using the Quick Access Toolbar

The Quick Access toolbar is a customizable toolbar. It remains visible no matter which tab is currently displayed, so you can store your most-used commands there and always have them at hand.

There are probably a handful of toolbar buttons that you use constantly. For me, the list would be Sort Ascending, Print, Filter by Selection, Align Right, Open Recent Files, and Decrease Decimal. If you tried to locate these six commands, you would find that they are spread throughout the ribbon interface.

Luckily, the Quick Access toolbar comes to the rescue. It holds up to 90 of your favorite icons and is always visible on the screen, so you can access its icons without needing to change to a different tab.

Changing the Location of the Quick Access Toolbar

The Quick Access toolbar is initially displayed above the left side of the ribbon. Initially, the toolbar offers Save, Undo, Redo, and Quick Print icons. Figure 2.17 shows the initial location and configuration of the Quick Access toolbar.

Figure 2.17 The default location of the Quick Access toolbar is above the ribbon.

The other option is to display the Quick Access toolbar immediately below the ribbon. To do so, click the drop-down arrow at the right edge of the Quick Access toolbar. Then select Show Below the Ribbon, as shown in Figure 2.18.

Figure 2.18 You can move the Quick Access toolbar below the ribbon.
when the quick access toolbar is below the ribbon, you can use a similar method to move the quick access toolbar back above the ribbon: click the drop-down on the right side of the quick access toolbar and select show above the ribbon.the ribbon inter face and qui ck access toolbar 34 1 part adding favorite commands to the quick access toolbar the drop-down shown in figure 2.18 offers 12 popular commands that you might choose to add to the quick access toolbar. of my six desired icons, three are already available in that list. when you find a command in the ribbon that you are likely to use often, you can add the command to the quick access toolbar. to do so, right-click any command in the ribbon and select add to quick access toolbar. for example, to add align right to the quick access toolbar, follow these steps: 1. access the home tab. 2. right-click the align right icon. 3. select add to quick access toolbar . items added to the quick access toolbar using the right-click method are added to the right side of the quick access toolbar. knowing which commands can be on the quick access toolbar you can add commands to the quick access toolbar, but you cannot add the contents of many lists on the ribbon. figuring out what you can and cannot add to the quick access toolbar requires a bit of experimentation. for example, consider the orientation icon in the alignment group of the home tab. you can use this icon to angle text counterclockwise. the icon also contains a drop-down that has a total of six commands: angle counterclockwise, angle clockwise, vertical text, rotate text up, rotate text down, and alignment. if you right-click the alignment icon and choose to add it to the quick access toolbar, the entire icon, along with the drop-down of six commands, is added. if you don’t want to add the entire drop-down to the quick access toolbar, you can instead open the drop-down and right-click one of the six items. 
just this individual command is added to the quick access toolbar.

however, other drop-downs are not drop-downs of commands. instead, they may lead to drop-downs with list boxes. for example, consider the font size drop-down. it is possible to add the entire font size drop-down as an icon on the quick access toolbar, but it is not possible to add to the quick access toolbar individual items from the list. you can add the drop-down itself, but you cannot, for example, add to the quick access toolbar an item that changes the font to 16 points. if you right-click an item in a list and the context menu doesn’t offer the ability to add it to the quick access toolbar, this is probably not a real command.

removing commands from the quick access toolbar

you can remove an icon from the quick access toolbar by right-clicking the icon and selecting remove from quick access toolbar. you can also remove icons by using the excel options dialog, as discussed in the following section.

customizing the quick access toolbar

you can make minor changes to the quick access toolbar by using the context menus, but you can have far more control over the quick access toolbar if you use the customize command. you right-click the quick access toolbar and select customize quick access toolbar to display the quick access toolbar section of the excel options dialog, as shown in figure 2.19.

the excel options dialog offers many features for customizing the quick access toolbar:

• you can choose to customize the quick access toolbar for all documents on your computer or just for the current document.
• you can add separators between icons to group the icons logically.
• you can resequence the order of the icons on the toolbar.
• you can access 1,286 commands, including the commands from every tab and commands that are not available in the ribbon.
• you can reset the quick access toolbar to its original default state.
• you can move the quick access toolbar to appear above or below the ribbon.

figure 2.19 you can completely customize the quick access toolbar using the excel options dialog.

using the excel options to customize the quick access toolbar for all workbooks

in the default state, the customize quick access toolbar drop-down in the excel options dialog is set to for all documents (default). this means that any changes you make to the quick access toolbar will apply to all excel documents opened on this computer.

initially, the choose commands from drop-down shows popular commands. the drop-down also lists every tab, plus two useful selections. if you select all commands, you get an alphabetical list of every possible command. if you select commands not in the ribbon, you see only the commands that you might have used in excel 2003 but that did not make it to a tab.

to add a new icon to the quick access toolbar for all workbooks, follow these steps:

1. choose the proper command subset from the choose commands from drop-down.
2. select the icon in the choose commands from list box. you might have to scroll to see the complete list.
3. click the add button to add the command to the customize quick access toolbar list.

the top choice in each category is a value called separator. you can add this to the customize quick access toolbar list to create a vertical bar between icons on the quick access toolbar.

customizing icons for the current workbook only

suppose you have 10 icons on your quick access toolbar for all workbooks. if you add additional icons for this workbook only, the icons appear after the 10 icons for all workbooks. to add these additional icons, follow these steps:

1. right-click the quick access toolbar or most places in the ribbon and select customize quick access toolbar.
2. 
with the customize quick access toolbar drop-down set to for all documents, select the separator icon at the top of the choose commands from list box. click add to add a vertical line at the end of the “all workbooks” section of the quick access toolbar.
3. from the customize quick access toolbar drop-down, select for (this workbook name).
4. use the choose commands from drop-down to find particular categories.
5. select an icon in the choose commands from list box.
6. click the add button.
7. repeat steps 4–6 as needed.
8. click ok to complete the operation.

when you finish with this process, the quick access toolbar shows the icons that apply to all workbooks, a vertical separator, and then icons that apply only to the current workbook.

tip: if you are going to use icons for all workbooks and additional icons for this workbook only, you might want to end the all-workbooks icons with a separator to help identify where the icons for this workbook only begin.

if you have the current workbook open and then switch to another open workbook, the icons assigned to the current workbook are hidden. if you arrange the windows so that you can see many workbooks at the same time, the icons for the current workbook stay visible only as long as that workbook is active.

filling up the quick access toolbar

the quick access toolbar allows about 90 icons and/or separators. this is more than will fit across the screen on most monitors. if the monitor is not large enough to display all the icons, the first 54 of them are shown onscreen, and the rest are hidden behind a double arrow at the right edge of the quick access toolbar.

rearranging icons on the quick access toolbar

you can rearrange icons on the quick access toolbar by using the excel options dialog. select any icon from the customize quick access toolbar list box and then use the up-arrow or down-arrow buttons on the far right side of the dialog to move the icon up or down.
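The ordering rules described above (icons for all workbooks first, then a separator, then icons for the active workbook only, with the overflow hidden behind the double arrow) can be sketched as a small model. This is an illustration in Python, not Excel's object model: the function names `toolbar_icons` and `split_overflow` are hypothetical, and the figure of 54 visible icons is simply the number quoted in the text, which depends on the monitor.

```python
from typing import List, Tuple

SEPARATOR = "|"  # stands in for the vertical separator bar


def toolbar_icons(all_workbooks: List[str],
                  this_workbook: List[str],
                  workbook_is_active: bool) -> List[str]:
    """Return the toolbar icons in left-to-right order.

    Icons stored for all workbooks always come first. Icons stored with
    one workbook follow, and only while that workbook is active.
    """
    icons = list(all_workbooks)
    if workbook_is_active:
        icons.extend(this_workbook)
    return icons


def split_overflow(icons: List[str],
                   visible_slots: int) -> Tuple[List[str], List[str]]:
    """Split icons into those shown onscreen and those behind the double arrow."""
    return icons[:visible_slots], icons[visible_slots:]


# The separator is stored at the end of the all-workbooks list, as in step 2.
shared = ["Save", "Undo", "Redo", SEPARATOR]
local = ["Align Right", "Decrease Decimal"]

print(toolbar_icons(shared, local, workbook_is_active=True))
print(toolbar_icons(shared, local, workbook_is_active=False))

# A full toolbar of 90 items on a screen that fits 54 of them.
shown, hidden = split_overflow([f"icon{i}" for i in range(90)], visible_slots=54)
print(len(shown), len(hidden))
```

Because the separator lives in the all-workbooks list, it stays on the toolbar even when the workbook-specific icons are hidden.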
resetting the quick access toolbar

if you start a new job and inherit someone else’s computer, you might want to start with a fresh slate of icons on the quick access toolbar. to reset the quick access toolbar to the default configuration, follow these steps:

1. with the customize quick access toolbar drop-down set to for all documents (default), click the reset button at the bottom of the customize quick access toolbar list box. a warning box asks if you want to reset the list for all workbooks.
2. click ok, and the list returns to the three default icons (save, undo, and redo).
3. with the customize quick access toolbar drop-down set to for (this workbook), click the reset button at the bottom of the customize quick access toolbar list box. a warning box asks if you want to reset the list for this particular document.
4. click ok, and the list clears.

assigning vba macros to quick access toolbar buttons

typically, a vba macro is assigned to a shortcut key. in legacy versions of excel, it was easy to customize the menu system to add commands to invoke macros, and more than 4,000 different icons were available for the various custom menu items.

excel 2010 offers a weak interface for adding custom macros to the quick access toolbar. in the excel options dialog, the choose commands from drop-down includes a group called macros. if you select this group, you see all public macros in all open workbooks. you can select a macro and click add to add that macro to the quick access toolbar.

note: i am not sure why you would choose to use the print icon for your macros. considering that you used to have 4,096 choices and now have only 180 choices, this is another area where excel 2010 does not live up to the legacy versions.

initially, every macro added to the quick access toolbar gets an identical icon. however, you can select an icon in the customize quick access toolbar list box and click the modify button.
the modify button dialog box that appears allows you to choose from 180 available icons for a macro. most of these buttons are similar to icons that are already popular. for example, the print icon is fairly well known and has a meaning. in addition to choosing from the 180 icons, you can type any text for a display name, as shown in figure 2.20. the display name does not appear next to the button, but if you hover your mouse over the icon on the quick access toolbar, you can see the display name in a tooltip.

figure 2.20 for macros, you can customize the button image and add a display name on the quick access toolbar.

troubleshooting excel: the downside of the quick access toolbar

although the quick access toolbar is cool, it has a few drawbacks compared to the legacy versions of excel.

in legacy versions of excel, you could choose to display an icon, words, or words and an icon for items that you added to the custom toolbar. with excel 2010, you are pretty much limited to just icons. when microsoft explained the need for the ribbon, it was pretty confident that pictures and words were far superior to just pictures. it seems curious that microsoft removed the ability to put words on the customizable quick access toolbar.

with legacy versions of excel, holding down shift while clicking an icon usually invoked the reverse of the icon. for example, shift+clicking sort ascending would perform sort descending. this functionality has been removed from excel 2010.

with legacy versions of excel, you could easily have multiple custom toolbars, and those toolbars could be docked to any side of the screen or floating above your worksheet. with excel 2010, you have limited customizations to the ribbon and the quick access toolbar.
the location of these elements must always be above the worksheet.

3 using other excel interface improvements

although the backstage view is likely to be the most talked-about feature in the excel 2010 interface, many of the changes introduced in excel 2007 will be dramatic if you are upgrading from excel 2003:

• live preview: you can preview formatting changes before you actually select the change.
• paste options: a newly expanded paste options menu will introduce many new popular shortcut key sequences to excel.
• mini toolbar: the mini toolbar appears whenever you select text. although this may happen rarely when you edit cells in excel, it does happen frequently when you work with charts, text boxes, and so on. the mini toolbar offers quick access to font, size, bold, italics, alignment, color, indenting, and bullets.
• formula bar: the formula bar includes the capability to expand or contract itself at your whim instead of the whim of excel.
• zoom slider: the zoom slider allows you to quickly change from seeing one page to hundreds of pages at a time.
• status bar: the status bar appears at the bottom of your worksheet window. although you probably never noticed it, the status bar in legacy versions of excel reported the total of any selected cells. this information is now improved and expanded in excel 2010.
• view control: the view control gives you one-click access to page break preview mode, normal mode, and the new page layout view.
• new sheet icon: the new sheet icon allows you to add new worksheets to a workbook with a single click.

using live preview

live preview provides an answer to the question, “what would this menu selection look like in the worksheet?”

if you open a drop-down list from the ribbon, live preview is often enabled. as you hover over various items in the list, the selected cells in the worksheet automatically preview what they would look like if you selected that option.
when you hover over a different selection, live preview quickly changes to reflect the new selection. figure 3.1 shows a live preview of some text in the copperplate gothic bold font.

figure 3.1 if you hover for a second over one font, excel shows a preview of the font in the selected cells.

the preview shown with live preview is not permanent. if you hover over another item, the view of the workbook changes. in the process of scrolling down to the showcard gothic font shown in figure 3.2, for example, excel flips through dozens of previews of the data. if you close the list box without clicking an item, the workbook reverts to the original style.

the live preview feature works with items as mundane as the font size and font face drop-downs. it also works with many features, such as table format and conditional formatting color scales.

there are a few items with which live preview does not work. when you try to decide on a chart type, for example, live preview is not enabled. this is because excel would actually have to add an object to the sheet to show you the preview. there are also a few drop-downs where live preview doesn’t work. microsoft figures you know what an underline or double underline would look like, so this drop-down does not allow live preview.

you can choose to turn off live preview if it annoys you. to do this, select file, options. the second check box in excel options allows you to turn off live preview.

previewing paste using the paste options gallery

here is a quick survey: have you ever opened a notepad window, pasted your data to notepad, copied from notepad, and then pasted to your application? this is a great way to remove formatting from a selection. if you have discovered that painful workaround, you are going to love this next feature that was added to excel 2010.

here is another survey: suppose that you have to copy a column of formulas and paste it as values.
do your fingers know how to do ctrl+c, then alt+e, s, v, enter? if so, you are going to love the new ctrl+v, ctrl, v keystrokes available in the paste options gallery. if you’ve ever done ctrl+c, then alt+e, s, v, enter, then alt+e, s, t, enter, you will love the new shortcut: the context key followed by e.

as someone who uses both of those old keyboard shortcuts frequently, i love the new paste options gallery. you can keep slicers, sparklines, even powerpivot; the paste options gallery is going to be the one feature that makes a difference in my life every single hour of every single workday.

microsoft discovered that paste was the number one command that was immediately followed by undo. to improve the paste command, microsoft added three paste options galleries to excel 2010. these galleries support live preview and keyboard shortcuts. they should make mouse-centric and keyboard-centric people very happy.

you encounter the gallery when you have something on the clipboard and one of these three events happens:

• you right-click a cell to access the context menu.
• you open the paste drop-down from the home tab.
• after performing a typical paste operation, the old paste repair menu icon appears with the tip that you can press ctrl to access the gallery.

figure 3.2 if you move to a new font the preview changes.

accessing the gallery after doing a paste

suppose that you copy a range with ctrl+c and then paste with ctrl+v. the icon for the old paste repair appears next to the pasted range, but this time it notes that you can open the menu by pressing ctrl. when you press ctrl, you are presented with a gallery of paste options, as shown in figure 3.3.

figure 3.3 the gallery offers 14 different ways to paste the data. an elusive 15th icon appears when your pasted data includes conditional formatting.

every one of those icons has a descriptive tooltip and a keyboard accelerator. figure 3.4 shows that the keyboard accelerator for pasting values along with the source formatting is e.
this means that you can use my new favorite keyboard sequence: ctrl+v to paste, ctrl to open the paste options gallery, e to paste values and formatting.

figure 3.4 every icon has a shortcut key.

table 3.1 lists the shortcut keys for the icons in the gallery.

table 3.1 keyboard shortcuts for paste options

key   action
p     paste
f     formulas
o     formulas & number formatting
k     keep source formatting
b     no borders
w     column widths
t     transpose
v     values
a     values and number formatting
e     values & source formatting
r     formats
n     paste link
u     paste as picture
i     paste as linked picture
s     opens paste special

in case you have not run into some of these options before, here is a quick synopsis of each option:

• paste: this is the standard paste that you would get using ctrl+v.
• formulas: paste only formulas but no formatting. this is common when you are copying down from the first row of a table that has an outline border. to prevent the top border from copying, you can paste formulas. you then find that you have to reapply the number formatting.
• formulas & number formatting: paste formulas as in the previous option, along with the number formatting.
• keep source formatting: this is particularly useful when copying from another application such as a web page. the formatting from the other application will be pasted along with the values.
• no borders: paste everything but the borders.
• column widths: include the column widths from the copied area.
• transpose: turn the data on its side. a 12-row by 1-column copied range would paste as 1 row by 12 columns.
• values: convert formulas to values.
• values and number formatting: convert the formulas to values and include the number formats from the copied data.
• values & source formatting: convert the formulas to values and include all formatting such as cell styles, font color, number formatting, and borders.
• formats: do not bring any values, only the cell formatting. similar to using the format painter but not as annoying.
• paste link: create formulas here that point back to the copied range.
• paste as picture: paste a picture of the original cells in this location.
• paste as linked picture: paste a live picture of the original cells in this location. this is the elusive camera tool from excel 2003.
• open paste special: access the old paste special dialog.

the paste special dialog still offers some choices not available in the paste options gallery: comments, validation, all using source theme, add, subtract, multiply, divide, and skip blanks. figure 3.5 shows the paste special dialog.

to see a demo of the paste options menu, search for excel in depth 3 on youtube.

figure 3.5 a few options in paste special are not covered in the paste options gallery.

accessing the paste options gallery from the right-click menu

the paste options gallery appears in the right-click context menu and includes live preview. as shown in figure 3.6, the top six options appear directly in the menu. a flyout menu offers all 14 options.

as you start to hover over the values options, live preview takes over. the rest of the context menu disappears so that you can see the worksheet. hover over transpose and you get a preview of what transpose actually does (see figure 3.7). hover over formatting and you see that the formatting option copies only the cell formats and not the numbers (see figure 3.8).

figure 3.6 right-click to access this menu.

figure 3.7 hover over an icon to see a live preview of the paste.
figure 3.8 move to another icon to see how that paste would work.

if you hover over paste special and then move out to the full gallery, all of the context menu except the full gallery disappears, and live preview continues to work (see figure 3.9).

figure 3.9 hover over the full gallery to access all 14 options with live preview.

why keyboard-centric people like the context gallery

i am not a right-click person. i always use keyboard shortcuts instead of the mouse. i can press alt+e, s, v, enter before most people can even move their hand over to the mouse.

take a close look at your keyboard. to the left of the spacebar, between the fn and alt keys, do you have the flying windows key? i’ve memorized a few of those shortcuts, like win+e to open windows explorer.

now, look over to the right of the spacebar. what do you have between alt and ctrl there? i have a key which i had never used before today. this key looks like the right-click menu and is the context menu key. when i press that key in excel, the right-click menu appears in the worksheet.

those six icons in the paste options gallery in the right-click menu each have a keyboard accelerator:

p: normal paste
v: paste values
f: formulas
t: transpose
r: formats
n: paste link

this means that there is an even faster keyboard method for converting formulas to values. press ctrl+c to copy, and then press the context key followed by v to convert to values. you would probably have to use two hands: ctrl+c with the left hand, context with the right hand, v with the left hand. it would take a little practice until this was as fast as ctrl+c, ctrl+v, ctrl, v, but it is worth a shot if you rely on keyboard shortcuts to speed your way through tasks.

accessing the paste options gallery from the paste drop-down

the paste options gallery also appears when you open the paste drop-down on the home tab. figure 3.10 shows the menu there.
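To compare the two keyboard routes described above, here is a minimal lookup sketch in Python. The key-to-action pairs are copied from table 3.1; everything else (the names `PASTE_OPTION_KEYS`, `paste_action`, `after_paste_sequence`, and `context_key_sequence`) is hypothetical and only models the keystroke sequences. It does not drive Excel.

```python
from typing import List

# Accelerator keys and actions as listed in table 3.1.
PASTE_OPTION_KEYS = {
    "P": "Paste",
    "F": "Formulas",
    "O": "Formulas & Number Formatting",
    "K": "Keep Source Formatting",
    "B": "No Borders",
    "W": "Column Widths",
    "T": "Transpose",
    "V": "Values",
    "A": "Values and Number Formatting",
    "E": "Values & Source Formatting",
    "R": "Formats",
    "N": "Paste Link",
    "U": "Paste as Picture",
    "I": "Paste as Linked Picture",
    "S": "Paste Special",
}


def paste_action(key: str) -> str:
    """Look up the paste action for an accelerator key (case-insensitive)."""
    return PASTE_OPTION_KEYS[key.upper()]


def after_paste_sequence(key: str) -> List[str]:
    """Keystrokes when the gallery is opened after a normal paste."""
    paste_action(key)  # raises KeyError for a key that is not in the table
    return ["Ctrl+C", "Ctrl+V", "Ctrl", key.upper()]


def context_key_sequence(key: str) -> List[str]:
    """Keystrokes when the gallery is reached through the context menu key."""
    paste_action(key)
    return ["Ctrl+C", "Context", key.upper()]


print(paste_action("e"))
print(after_paste_sequence("v"))
print(context_key_sequence("v"))
```

Counting the entries in each list shows why the context key route is attractive: it needs one keystroke fewer than pasting first and then opening the gallery with ctrl.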
figure 3.10 the gallery replaces the old paste drop-down in the home tab.

note: for those of you who started using the values or transpose options in the excel 2007 paste drop-down, you might be frustrated that this confusing array of pictures appears instead of words. after a few days of transition, you will start to get used to which pictures indicate which commands.

using the mini toolbar to format selected text

the mini toolbar is a shy attendant. when you select some text, almost imperceptibly, the mini toolbar faintly appears above the text. if you ignore the mini toolbar, it fades away. however, if you move the mouse toward the mini toolbar, the toolbar solidifies and offers you several text formatting options.

in your initial use of excel 2010, you might not see the mini toolbar. although you often select cells or ranges of cells, it is rare to select only a portion of a cell value in cell edit mode. however, as you begin using charts, smartart diagrams, and text boxes, you will have the mini toolbar appearing frequently.

to use the mini toolbar, follow these steps:

1. select some text. if you select text in a cell, you must select a portion of the text in the cell by using cell edit mode. in a chart, smartart diagram, or text box, you can select any text. the mini toolbar appears faintly. on some computers and with some color schemes, “faintly” actually means “completely transparently.”
2. move the mouse pointer toward the mini toolbar, and the toolbar solidifies. the mini toolbar stays visible if your mouse is above it. after a period of inactivity, it disappears. if you move the mouse away from the mini toolbar, it fades away.
3. make changes in the mini toolbar to affect the text you selected in step 1. the mini toolbar always has the same icons, even though some of them may not apply in the current situation.
4. 
when you are done formatting the selected text, you can either move the mouse away from the mini toolbar or use the format painter icon to apply the changes to additional text.
5. to use the format painter icon, click the paintbrush in the lower-right corner of the mini toolbar. then move toward other text in the document. as shown in figure 3.12, the mouse pointer is a black-and-white paintbrush, to indicate that you are in format painter mode. when you click the other text, excel applies the same formatting to the new text.

initially, it is difficult to see the mini toolbar. you have to move the mouse toward the upper right to get the toolbar to solidify. in figure 3.11, for example, it does not make sense to apply indenting to the smartart, but the icons are always there and in the same place.

in the top row, the mini toolbar offers eight controls:

• font name drop-down: you open this drop-down to choose a typeface. each of the various font names is displayed in its own font so that you can select an appropriate font easily.
• font size drop-down: this drop-down offers font sizes from 8 to 96, in several increments.

note: microsoft began experimenting with fading toolbars in outlook 2003. in that version, a new message toolbar would fade into view in the lower-right corner of your screen. you could glance down and read the first line of the email. you could ignore the toolbar, and the message would be waiting for you later in your inbox. or you could move the mouse toward the notifier, and it would stay long enough for you to click delete or open. i enjoyed this feature of outlook 2003. if my attention needed to stay on the task at hand, i could ignore the notifier, and it would unobtrusively fade away. however, if i were waiting for a message, i could handle it as it came in, avoiding a buildup of messages in my inbox. the new mini toolbar is another feature that fades in if you move toward it and fades out if you ignore it.
i expect to see more fade-in/fade-out features in future versions of office.

figure 3.11 the mini toolbar appears when you select text and move up and to the right.

figure 3.12 after using the mini toolbar to format some text, use the format painter to copy that formatting to other text. (callout: format painter mouse pointer)

the remaining controls in the top row of the mini toolbar are:

• increase font size icon: you click this icon to bump the font up to the next larger size.
• decrease font size icon: you click this icon to make the font one size smaller.
• decrease indent icon: change the list level.
• increase indent icon: change the list level.
• bring forward icon: used for shapes and objects.
• send backward icon: used for shapes and objects.

in the bottom row, the mini toolbar offers nine controls:

• bold icon: use this to toggle bold on and off. if bold is already applied, the bold icon has a glow effect around it.
• italics icon: use this to toggle italics on and off.
• align left icon: click this control to left-align the text.
• center align icon: click this control to center the text.
• right align icon: click this control to right-align the text.
• font color drop-down: use this drop-down to select a color. a menu item at the bottom of this drop-down allows you to display the colors dialog box.
• fill color drop-down: use this drop-down to select a fill color.
• shape outline drop-down: use this drop-down to change the color and style of any line in the shape.
• format painter: the format painter allows you to copy formatting from one place to another. the format painter is discussed in detail in the following section.

getting the mini toolbar back

the shyness of the mini toolbar might be the most frustrating part of using it. if you move the mouse away from the mini toolbar, it fades away. if you immediately move back toward the mini toolbar, it comes back. if you use the mouse for some other task, such as scrolling, the mini toolbar permanently goes away.
in this case, you might have to reselect the text to get the mini toolbar to come back.

disabling the mini toolbar

if you are annoyed by the mini toolbar, you can turn it off for all excel workbooks. to do this, select file, options. the first choice in excel options is a check box for show mini toolbar on selection. clear this check box.

caution: using the format painter icon is difficult to master. you get only one click to apply the formatting. if you inadvertently click a nontext element, you lose the format painter mouse pointer.

tip: the format painter command in the clipboard group of the home tab is a bit easier to use than the format painter icon. you can double-click this command to keep the application in format painter mode. you are then free to click multiple objects, applying the format to various elements.

expanding the formula bar

formulas range from very simple to very complex. as people began writing longer and longer formulas in excel, an annoying problem began to appear: if the formula for a selected cell was longer than the formula bar, the formula bar would wrap and extend over the worksheet (see figure 3.13). in many cases, the formula would obscure the first few rows of the worksheet. this was frustrating, especially if the selected cell was in the top few rows of the spreadsheet.

figure 3.13 in legacy versions of excel, the formula bar could obscure cells on a worksheet. in this case, both the active cell, b4, and the dependent cell, c4, are hidden.

excel 2010 features a new formula bar that prevents the formula from obscuring the spreadsheet. for example, in figure 3.14, cell e4 contains a formula that is longer than the formula bar. notice the two new controls at the right end of the formula bar: a scrollbar and an expand formula bar icon (which looks like a downward-pointing double arrow).

tip: clicking the % indicator to the left of the zoom slider opens the legacy zoom dialog.
figure 3.14 by default, excel 2010 shows the initial portion of the formula.

you use the formula bar scrollbar to scroll through the formula, one line at a time. you use the expand formula bar icon to expand the formula bar. as shown in figure 3.15, expanding the formula bar actually moves the grid down. this way, you can see the formula bar and still see the cells in the grid, too. in expanded mode, the expand formula bar icon is replaced by a double up-pointing arrow that you can use to contract the formula bar back to one line.

figure 3.15 you can click a button to expand the formula bar.

in figure 3.15, you are not seeing the entire formula. the line underneath the formula bar can be dragged up or down to change the size of the formula bar, as shown in figure 3.16.

figure 3.16 drag the bar underneath the formula bar to expand the formula bar even more.

zooming in and out on a worksheet

in the lower-right corner of the excel window, a new zoom slider allows you to zoom anywhere from 10% to 400% with lightning speed. you simply drag the slider to the right to zoom in and to the left to zoom out. the zoom out and zoom in buttons on either end of the slider allow you to adjust the zoom in 10% increments. figure 3.17 shows the zoom control set to the maximum zoom of 400%.

figure 3.17 you can use the zoom slider or the zoom out and zoom in buttons to change the zoom.

at the opposite end of the zoom spectrum, the 10% view shows an overview of about 150 printed pages of the worksheet. as shown in figure 3.18, you cannot make out any numbers at a 10% zoom. however, in the 40%–60% zoom range, you can see 3 to 10 pages and actually make out the numbers in the cells.
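The arithmetic behind the zoom out and zoom in buttons can be sketched in a few lines of Python. The 10% and 400% limits and the 10% step come from the text above; the function names are hypothetical, and the sketch assumes a click simply adds or subtracts one step and then clamps to the limits, which may not match how Excel treats a zoom level that is not already a multiple of 10.

```python
MIN_ZOOM = 10    # smallest zoom percentage offered by the slider
MAX_ZOOM = 400   # largest zoom percentage offered by the slider
STEP = 10        # change applied by one click of a zoom button


def clamp_zoom(percent: int) -> int:
    """Keep a zoom percentage inside the slider's range."""
    return max(MIN_ZOOM, min(MAX_ZOOM, percent))


def zoom_in(percent: int) -> int:
    """Model one click of the zoom in button (assumed: add one step, then clamp)."""
    return clamp_zoom(percent + STEP)


def zoom_out(percent: int) -> int:
    """Model one click of the zoom out button (assumed: subtract one step, then clamp)."""
    return clamp_zoom(percent - STEP)


zoom = 100
for _ in range(5):
    zoom = zoom_out(zoom)
print(zoom)            # five clicks down from 100%
print(zoom_in(400))    # already at the maximum
print(zoom_out(10))    # already at the minimum
```

Under this assumption, getting from the default 100% down to the 40%–60% range mentioned above takes four to six clicks of the zoom out button.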
figure 3.18 at 10% zoom, you can see 150 pages at once.

using the status bar to add numbers

if you select several cells that contain numeric data and then look at the status bar, at the bottom of the excel window, you can see that the status bar reports the average, count, and sum of the selected cells (see figure 3.19).

if you need to quickly add the contents of several cells, you can select the cells and look for the total in the status bar. this feature has been in excel for a decade, yet very few people realized it was there. in legacy versions of excel, only the sum would appear, but you could right-click the sum to see other values, such as the average, count, minimum, and maximum.

figure 3.19 the status bar shows the sum, average, and count of the selected cells.

as with legacy versions of excel, in excel 2010 you can customize which statistics are shown in the status bar. in fact, in excel 2010 you can configure all the status bar elements. to do so, you right-click the status bar to display the status bar configuration panel. in this panel, you can see the current value of all status bar indicators, even those that are hidden (see figure 3.20). to add new items to the status bar, you click them in the status bar configuration panel.

switching between normal view, page break preview, and page layout view modes

three shortcut icons in the status bar allow you to quickly switch between three view modes, as shown in figure 3.21:

• normal view: this mode shows worksheet cells as normal.
• page break preview: this mode draws the page breaks in blue. you can actually drag the page breaks to new locations in page break preview. this mode has been available in several versions of excel.
• page layout view: this is a new view introduced in excel 2007. it combines the best of page break preview and print preview modes.
tip: because it is possible to navigate and enter formulas in any of the view modes, you might want to do actual worksheet editing in the new page layout view mode.

in page layout view mode, each page is shown, along with the margins, header, and footer. a ruler appears above the pages and to the left of the pages. (see figure 3.22.) you can make changes in this mode in the following ways:

• to change the margins, drag the gray boxes in the ruler.
• to change column widths, drag the borders of the column headers.
• to add a header, click click to add header.

figure 3.20 you can configure the status bar to show or hide all these indicators.

figure 3.21 three view shortcuts appear in the status bar.

using the new sheet icon to add worksheets

the final new control in the excel interface is the insert worksheet icon. this icon appears as a small worksheet tab with a new icon. the tab appears to the right of the last worksheet tab, as shown in figure 3.23. you click the icon to add a new worksheet to the end of the workbook.

figure 3.22 the new page layout view mode gives a view of page breaks, margins, headers, and footers.

figure 3.23 add a new worksheet to the end of your workbook by using the new worksheet icon.

dragging a worksheet to a new location

after a worksheet has been added to the end of the workbook, you can drag the sheet to a new location in the middle of the workbook. follow these steps to move a worksheet to a new location:

1. click the worksheet tab.
2. drag the mouse left or right. a sheet of paper appears under the mouse pointer.
3. watch for the insertion triangle just above the row of sheet names. in general, the insertion triangle indicates that the sheet will be dropped to the left of the sheet you are hovering above.
4.
when the insertion triangle is in the correct location, release the mouse button. the worksheet will be moved to the new location.

inserting a worksheet in the middle of a workbook

although using the new worksheet icon and then dragging a worksheet to a new location is easier, you can also insert a worksheet in a particular location. to insert a worksheet to the left of the current worksheet, for example, select home, cells, insert, insert sheet. the new sheet is added before the current sheet.

alternatively, you can right-click any sheet and select insert. the insert dialog appears, where you can choose to insert a worksheet or a variety of templates. the new sheet appears to the left of the selected tab.

4 customizing the ribbon

customizing the ribbon is back. those of you who upgraded from excel 2003 directly to excel 2010 will say, “well, of course you can customize the ribbon.” however, for those of us who lived through excel 2007, there was no easy way to customize the ribbon.

starting with excel 97 command bars, it was possible to completely customize excel 97 through excel 2003. you could add icons, remove icons, adjust the menu, create new toolbars, float the toolbars, and so on. power excellers everywhere customized their environment to match their work style. in a blog post on june 27, 2006, jensen harris of microsoft’s ui team wrote that 1.3 million people were working with customized versions of office.

but that wasn’t enough. it would cost too much and take too much time to allow for customizations of the ribbon. if you were a programmer who understood xml, you could still change the ribbon using ribbonx. but the decision had been made that office 2007 would not support a customizable ribbon.

thankfully, excel 2010 now offers a way to customize the ribbon. whether you want to add a few commands or design your own ribbon tab, you will be accommodated in excel 2010.
also, after you have designed the perfect customization, you can share that customization with others in your department.

performing a simple ribbon modification

suppose that you generally like the ribbon, but there is one icon that seems to be misplaced. for me, that icon is the pivottable command. i have no idea why this is on the insert tab instead of on the data tab where it belongs. take a look at the data tab as shown in figure 4.1. it would make sense to have the pivot table command right after the sort & filter group and before the data tools group.

figure 4.1 decide where the new command should go.

to add the pivot table command to the data tab, follow these steps:

1. right-click the ribbon and select customize the ribbon.
2. in the right list box, expand the data tab by clicking the plus sign next to data.
3. click the sort & filter entry in the right list box. the new group will go after this entry.
4. click the new group button at the bottom of the right list box. a new group (custom) item appears after sort & filter, as shown in figure 4.2.

figure 4.2 commands have to be added to a new group.

5. while the new group is selected, click the rename button at the bottom of the list box. the rename dialog appears.
6. the rename dialog offers to let you choose an icon, which does not make any sense in the current context. at the bottom, type a display name of “pivot.” click ok.
7. the left list box is showing popular commands. scroll through that list to find pivottable. click pivottable in the left list box. click the add button in the center of the dialog to add pivottable to the new custom pivot group on the ribbon.
8. in the drop-down above the left list box, select all commands. the left list box changes to show an alphabetical list of all commands.
9. scroll through the left list box until you find pivottable and pivotchart wizard.
this is the obscure entry point to create multiple consolidation range pivot tables. select that item in the left list box.
10. click add in the center of the dialog. the pivottable and pivotchart wizard command is added after the regular pivottable command.
11. while the wizard is selected in the right list box, select rename from the bottom of the right list box. choose a different icon and type a new name, “multiple consolidation.” at this point, the right side of the dialog should look like figure 4.3.

figure 4.3 two new icons have been added to a new custom group on the data tab.

12. click ok.

figure 4.4 shows the new group in the data tab of the ribbon. to see a demo of adding an icon to the ribbon, search for excel in depth 4 at youtube.

figure 4.4 the results appear in the ribbon.

using a more complex ribbon modification

in my use of excel 2010, i find that i am constantly bouncing back and forth between the home tab and the data tab. if i could add the key icons from the data tab to the right side of the home tab, i would not have to move from the home tab.

following is the list of icons that i regularly use on the data tab:

• az to sort ascending
• za to sort descending
• the filter button, but i would rather have the filter by selection button instead
• the advanced filter button
• pivottable
• text to columns
• remove duplicates
• goal seek from the what-if analysis drop-down
• subtotal

those are nine icons i would like to add to the home tab of the ribbon. after studying the icons, i think that i can get by with only icons and no words. this will allow me to jam several icons into a small space.

on the home tab, i can do without the editing group. i know keyboard shortcuts for every command in that group that i use.

follow these steps to customize the home tab:

1. right-click the ribbon and select customize the ribbon.
2. in the right list box, expand the home tab.
3. click the editing group.
4.
click remove in the center of the dialog to remove the editing group.
5. click new group to add a custom group to the home tab.
6. click rename and call the group data.
7. from the drop-down above the left list box, select main tabs. the left list box now shows a list of the main ribbon tabs.
8. in the left list box, expand the data tab by clicking the plus sign.
9. expand the sort & filter group in the left list box.
10. click sort ascending and click add. repeat for sort descending and advanced (filter).
11. in the left list box, open the insert tab, then the tables group, and then the pivottable drop-down. add the pivottable command to the right side.
12. in the left list box, expand the data tools group of the data tab. add text to columns and remove duplicates to the right list box. expand the what-if analysis drop-down and add goal seek to the right list box.
13. in the left list box, expand the outline group and add subtotal to the right list box.
14. in the drop-down above the left list box, select commands not in the ribbon.
15. in the left list box, find the icon labeled autofilter. this icon is actually filter by selection. in the right list box, select sort descending because you want to add filter by selection after sort descending. select autofilter in the left list box. click add.

at this point, you have added nine icons to the data group on the home tab, as shown in figure 4.5. you have no control over whether the icons appear as small or large. when you eventually add enough icons, those icons will appear as small.

figure 4.5 nine new icons appear in the home tab.

to remove the labels and leave only icons, follow these steps:

1. right-click the ribbon and select customize the ribbon.
2. in the right list box, find the custom data group in the home tab.
3. select the first icon and click rename. backspace through the name in the modify dialog box.
4.
repeat steps 2 and 3 for the other eight icons.
5. click ok.

the icons now appear as a small 3x3 arrangement of icons. you don’t have explicit control over the arrangement of the icons. however, in the customize dialog, you can drag an icon in a custom group to a new location. in figure 4.6, the icons are rearranged so that the two filter icons appear together in the second column. there was also room for the editing group to fit on the home tab, so it was added back.

note: notice that words appear next to six icons and not next to the last three icons. this is a situation where excel is already starting to collapse the ribbon because the screen isn’t wide enough. if the monitor had a higher resolution, the last three icons would have labels as well.

figure 4.6 this has to be the most powerful group of icons ever assembled.

hiding/showing ribbon tabs

you can temporarily remove a ribbon tab without deleting it. notice in the screenshots so far that the developer and powerpivot tabs are unchecked in the customize dialog.

excel 2007 had an excel option for show developer tab in the ribbon. that command was removed from excel 2010 because you just add a check mark next to the developer entry in the right side of the customize dialog.

adding a new ribbon tab

to add a new ribbon tab, follow these basic steps:

1. right-click the ribbon and select customize the ribbon.
2. click new tab and rename the tab.
3. add new group(s) to the new tab.
4. add commands to the new groups.

as you go through the steps to add a new ribbon tab, you will discover how absolutely limiting the ribbon customizations are. in figure 4.7, the power group of nine data icons from figure 4.6 now shows up as large icons. the icons in the font group appear as small. you have no control over which groups appear with large icons and which groups appear as small.

figure 4.7 for no apparent reason, the nine small icons become nine large icons.
in a perfect world, the nine data icons would appear as small and the cell styles drop-down would appear as a gallery, as it does on a 1440-pixel-wide monitor. however, cell styles always appears as a drop-down, even if it is the only thing on the entire ribbon tab (see figure 4.8).

figure 4.8 when added to a custom group, a gallery appears as a drop-down.

the workaround is to add an entire built-in group to the new ribbon. in figure 4.9, the styles group from the home tab is added as the only group in a new ribbon tab. because the built-in groups allow for different types of controls, the gallery appears in a wide format, as shown in figure 4.9.

sharing customizations with others

if you have developed the perfect ribbon customization and you want everyone in your department to have the same customization, you can export all the ribbon customizations. to export the changes, follow these steps:

1. right-click the ribbon and select customize the ribbon.
2. below the right list box, select import/export, export all customizations.
3. browse to a folder and provide a name for the customization file. the file type will be .exportedui. click ok.
4. in windows explorer, find the exported .exportedui file. copy it to a co-worker’s computer.
5. on the co-worker’s computer, repeat step 1. in step 2, choose to import the customizations instead. find the file and click ok.

resetting customizations

it is easy to go back to the default ribbon configuration. to do this, right-click the ribbon and select customize the ribbon. below the right list box, select reset, reset all customizations. click ok.

questions about ribbon customization

can the customizations apply to only a certain workbook?
no. the customize the ribbon command in excel 2010 applies to all workbooks.

can toolbars be docked to the side of the screen or floating as in excel 97–2003?
no. the ribbon always must be at the top of the screen, in a horizontal position.
where is the excel 2003 icon editor? where is the list of 4,096 icons available back in excel 2003?
neither item is supported in excel 2010.

how can i get complete control over the ribbon?
learn ribbonx and write some vba to build your own ribbon. for more information on building your own ribbon, see ribbonx: customizing the office 2007 ribbon by robert martin, ken puls, and teresa hennig (wiley, isbn 0470191112).

these ribbon customizations are really lacking. is there another option that doesn’t require me to write a program?
yes, there are a number of third-party ribbon customization programs. for example, check out a free one from excel mvp andy pope at www.andypope.info/vba/ribboneditor.htm.

figure 4.9 when added as a built-in group, the complete gallery appears.

note: exporting customizations is an all-or-nothing proposition. you cannot export your changes to the mrexcel tab without exporting your changes to the data and home tabs.

5 keyboard shortcuts

if you do a lot of typing, being able to access commands from the keyboard is faster than moving your hand to the mouse. excel 2010 introduces new keyboard accelerators accessed using the alt key. in addition, many of the old alt keyboard shortcuts still work and all the old ctrl shortcut keys are still functional. for instance, ctrl+c still copies a selection, ctrl+x cuts a selection, and ctrl+v pastes a selection.

this chapter points out which of the old keyboard shortcuts still work, shows you some new shortcuts, and introduces you to the new keyboard accelerators.

using new keyboard accelerators

the goal of the new excel 2010 keyboard accelerators is to allow you to access every command by using only the keyboard. in legacy versions of excel, many popular commands had keyboard accelerators, but other commands did not. excel 2010 tries to ensure that every command can be invoked from the keyboard.

to access the new accelerators, press and release the alt key.
notice that excel places a tooltip above each command, with an associated accelerator key.

note that an arcane command exists in the excel options dialog that can cause the new keyboard accelerators not to work for you. it is possible that you turned on this setting in excel 1995 and each successive upgrade of excel has inherited the setting. you should check the setting before proceeding. to do so, select file, options. in the advanced category, scroll to near the bottom for lotus compatibility. if transition navigation keys is selected, then the slash character shown in the microsoft office menu key will be used instead of alt to invoke shortcuts. if you prefer using the alt key, you should clear the transition navigation keys check box. keep in mind that if you prefer using the slash key, you must use / in place of alt with new keyboard accelerators.

tiny letter tooltips appear over each tab of the ribbon. in addition, number tooltips appear over each icon in the quick access toolbar. figure 5.1 shows the tooltips.

figure 5.1 type the letters in the tooltips along the top to open various tabs. type the numbers in the numeric keytips to access the quick access toolbar. (callouts: keyboard accelerators for ribbon tabs; keyboard accelerators for qat icons.)

it is possible to memorize the keytips for the ribbon tabs. pressing alt+f always accesses the file menu in all office 2010 applications. alt+h always accesses the home tab in all office 2010 applications. the accelerator definitions for each tab remain constant even if new ribbon tabs are displayed. when you activate a pivot table, the original keytip letters (f, h, n, p, m, a, r, w, l, and x) remain, and two new keytips appear for the two new tabs: jt for pivottable tools options and jy for pivottable tools design (see figure 5.2).
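the way the tab keytips stay fixed while contextual tabs add new ones can be pictured as a simple lookup. the python sketch below is only a model with hypothetical names. the letters come from the text above; the pairing of n, p, m, r, l, and x with specific tab names is an assumption based on the default excel 2010 ribbon and is not stated in this section.

```python
# Keytip letters for the standard tabs. The letters are listed in the text;
# the tab names for n, p, m, r, l, and x are assumptions about the default ribbon.
TAB_KEYTIPS = {
    "f": "file",
    "h": "home",
    "n": "insert",
    "p": "page layout",
    "m": "formulas",
    "a": "data",
    "r": "review",
    "w": "view",
    "l": "developer",
    "x": "add-ins",
}

# Extra keytips that appear only while a pivot table is active.
PIVOT_KEYTIPS = {
    "jt": "pivottable tools options",
    "jy": "pivottable tools design",
}


def resolve_tab(keys_after_alt, pivot_table_active=False):
    """Return the tab opened by the keys typed after alt, or None if there is no match."""
    keys = keys_after_alt.lower()
    if pivot_table_active and keys in PIVOT_KEYTIPS:
        return PIVOT_KEYTIPS[keys]
    # The standard letters resolve the same way whether or not a pivot table is active.
    return TAB_KEYTIPS.get(keys)
```

in this model, h resolves to the home tab in every situation, while jt resolves to a tab only when a pivot table is active.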
figure 5.2 new ribbon tabs get new letters, making sure the old letters remain constant. (callout: contextual tab shortcuts.)

unfortunately, the keytips for the quick access toolbar change every time you add new buttons or rearrange buttons on the quick access toolbar. if you want to memorize those keytips, you need to make sure you do not add a new quick access toolbar icon at the beginning of the list.

selecting icons on the ribbon

after you press the alt key, you can press one of the keytip letters to bring up the appropriate tab. you now see that every icon on the ribbon has a keytip.

when you choose a ribbon tab, the keytips on the quick access toolbar disappear, so microsoft is free to use the letters a through z and the numbers 0 through 9.

on very busy ribbon tabs, some commands require two keystrokes: for example, ac for align center in the alignment group of the home tab, as shown in figure 5.3. note that after you press alt to display the accelerators in the tooltips, you do not have to continue holding down the alt key.

some shortcut keys seem to make sense: at for align top, am for align middle, ab for align bottom, al for align left, w for wrap text, and m for merge. other shortcut keys seem to be assigned at random. some take a little pondering: fa for the dialog launcher in figure 5.3 makes sense in that it opens the legacy format dialog and moves to the alignment tab. others have a historical precedent. in excel 2003, f was used for file, so o was used for format. similarly, in the home tab, o now opens the format drop-down, although since microsoft no longer underlines the accelerator key in the menu name, o will never make sense to someone new to excel. there might be some arcane, logical reason why 5 and 6 are used for increase and decrease indent, but it is unknown by most people.

figure 5.3 after pressing the letter to switch to the ribbon, type the letter or letters to invoke a particular command.
selecting options from a gallery

figure 5.4 shows the results of pressing alt+h+t, which is the equivalent of selecting home, format table. this opens the gallery of possible table styles. as you can see in figure 5.4, you can invoke the new table style and new pivot style commands at the bottom of the gallery by pressing n and p, respectively. however, there are no letters on the table style choices in the gallery. to select a table style using the keyboard, use the arrow keys to move through the gallery. because this gallery is two-dimensional, you can use the up arrow, down arrow, right arrow, left arrow, page down, page up, home, and end keys to navigate through the gallery. when you have the desired table style highlighted, press the enter key to select it.

navigating within drop-down lists

if you press alt+h+f+s, which is the equivalent of selecting home, font size, the font size in the drop-down is selected. you can either type a font size and press enter or press the down-arrow key to open the drop-down list. you can then use the down arrow, up arrow, page down, page up, home, and end keys to navigate to a choice in the list. when you have the desired item highlighted, press enter to select that item.

backing up one level through a menu

suppose that you press alt+h to access the home tab and then realize you are in the wrong tab. you can press the esc key to move back to display the tooltips for the main menu choices. if you want to clear the tooltips completely, press alt again.

dealing with keyboard accelerator confusion

if you want to select something on the home tab in figure 5.2, you may be frustrated because you can see the menu choices, but there are no tooltips for most commands. for icons in the top of the ribbon, it appears that the main keytips apply to the menu items. for example, you may think that the h keytip applies to cut.
even though you are already on the home tab, you need to press the h key to force excel to show the tooltips for the individual menu items on the home tab.

note: if you find the accelerator tooltips to be confusing and unwieldy, you need to attack them one at a time. find a task that you use regularly, such as sorting the current data set ascending by the selected column. press the alt key. press a for the data tab. notice that a sorts ascending and d sorts descending. these should be easy enough to remember: alt+a+a for sort ascending, and alt+a+d for sort descending.

selecting from legacy dialog boxes

some commands lead to legacy dialog boxes like the ones in previous editions of excel. these dialog boxes do not display the excel 2010 keytips. however, most of the dialog boxes do use the convention of having one letter of each command underlined, which is called a hotkey in microsoft parlance. in this case, you can press the underlined letter to select the command.

figure 5.4 after opening a gallery, you use the arrow keys to navigate through the gallery and press enter to select a style.

for example, press alt+h+v+s instead of selecting home, paste, paste special. you are then presented with the paste special dialog box, as shown in figure 5.5. to select values and transpose in this dialog, press v for values and e for transpose, because those are the letters underlined in the dialog. you can then press enter instead of clicking the default ok button.

figure 5.5 in a legacy dialog box, type the underlined letters to select options.

to watch a video of legacy dialog boxes, search for excel in depth 5 at youtube.

using the shortcut keys

excel 2010 automatically recognizes all the ctrl shortcut keys that were used in legacy versions of excel. in fact, many of these keys are consistent across all windows applications. table 5.1 lists the common windows ctrl shortcut keys.
table 5.1 windows shortcut keys

key combination    action
ctrl+c    copy.
ctrl+x    cut.
ctrl+v    paste.
ctrl+z    undo.
ctrl+y    redo.
ctrl+a    select all.
ctrl+s    save.
ctrl+o    open.
ctrl+w or ctrl+f4    close workbook.
ctrl+n    new workbook.
ctrl+p    print.
ctrl+b    bold.
ctrl+u    underline.
ctrl+i    italic.
ctrl+f    find.

table 5.2 illustrates the shortcut keys that you use to navigate.

table 5.2 shortcut keys for navigation

shortcut key    action
ctrl+home    move to cell a1.
ctrl+end    move to last cell in the used range of the worksheet.
ctrl+page up    move to previous worksheet.
ctrl+page down    move to next worksheet.
shift+f11    new worksheet.
alt+tab    switch to next program.
alt+shift+tab    switch to previous program.
ctrl+esc    display windows start menu.
ctrl+f5    restore window size of current workbook.
f6    switch to next pane in a window that has been split.
ctrl+f6    when more than one workbook is open, switch to the next open workbook window.
ctrl+shift+f6    switch to the previous workbook window.
ctrl+f9    minimize the window.
ctrl+f10    maximize the window.
ctrl+arrow key    move to edge of current region.
home    move to beginning of row.
ctrl+backspace    scroll to display the active cell.
f5    display the go to dialog.
shift+f5    display the find dialog.
shift+f4    find next.
ctrl+. (period)    move to next corner of selected range.

table 5.3 shows the shortcut keys you use to select data and cells.

table 5.3 shortcut keys for selecting data and cells

shortcut key    action
ctrl+spacebar    if used outside a table, select entire column. if used inside a table, toggle between selecting the data, data and headers, and the entire column.
shift+spacebar    select entire row. if inside a table, toggle between selecting the table row and the entire row.
shift+backspace    with multiple cells selected, revert selection to only the active cell.
ctrl+shift+*    select the current region.
ctrl+/    select the array containing the active cell.
ctrl+shift+o (letter o)    select all cells that contain comments.
ctrl+\    in a selected row, select the cells that do not match the value in the active cell.
ctrl+shift+|    in a selected column, select the cells that do not match the value in the active cell.
ctrl+[ (opening square bracket)    select all cells directly referenced by formulas in the selection.
ctrl+shift+{ (opening brace)    select all cells directly or indirectly referenced by formulas in the selection.
ctrl+] (closing square bracket)    select cells that contain formulas that directly reference the active cell.
ctrl+shift+} (closing brace)    select cells that contain formulas that directly or indirectly reference the active cell.
alt+; (semicolon)    select the visible cells in the current selection.

table 5.4 shows the shortcut keys you use to extend a selection.

table 5.4 shortcut keys for extending selections

shortcut key    action
f8    turn extend mode on or off. in extend mode, ext appears in the status line and the arrow keys extend the selection.
shift+f8    add another range of cells to the selection, or use the arrow keys to move to the start of the range you want to add. then press f8 and the arrow keys to select the next range.
shift+arrow key    extend the selection by one cell.
ctrl+shift+arrow key    extend the selection to the last nonblank cell in the same column or row as the active cell.
shift+home    extend the selection to the beginning of the row.
ctrl+shift+home    extend the selection to the beginning of the worksheet.
ctrl+shift+end    extend the selection to the last used cell on the worksheet in the lower-right corner.
shift+page down    extend the selection down one screen.
shift+page up    extend the selection up one screen.
end+shift+arrow key    extend the selection to the last nonblank cell in the same column or row as the active cell.
end+shift+home    extend the selection to the last used cell on the worksheet in the lower-right corner.
end+shift+enter    extend the selection to the last cell in the current row.
scroll lock+shift+home    extend the selection to the cell in the upper-left corner of the window.
scroll lock+shift+end    extend the selection to the cell in the lower-right corner of the window.

table 5.5 shows the shortcut keys you use for entering, editing, formatting, and calculating data.

table 5.5 shortcut keys for data entry, formatting, and calculating data

shortcut key    action
enter    complete a cell entry and select the next cell below.
alt+enter    start a new line in the same cell.
ctrl+enter    fill the selected cell range with the current entry.
shift+enter    complete a cell entry and select the next cell above.
tab    complete a cell entry and select the next cell to the right.
shift+tab    complete a cell entry and select the previous cell to the left.
esc    cancel a cell entry.
arrow keys    move one character up, down, left, or right.
home    move to the beginning of the line.
f4 or ctrl+y    repeat the last action.
ctrl+shift+f3    create names from row and column labels.
ctrl+d    fill down.
ctrl+r    fill to the right.
ctrl+f3    define a name.
ctrl+k    insert a hyperlink.
ctrl+; (semicolon)    enter the date.
ctrl+shift+: (colon)    enter the time.
alt+down arrow    display a drop-down list of the values in the current column of a range.
ctrl+z    undo the last action.
= (equal sign)    start a formula.
backspace    in the formula bar, delete one character to the left.
enter    complete a cell entry from the cell or formula bar.
ctrl+shift+enter    enter a formula as an array formula.
esc    cancel an entry in the cell or formula bar.
shift+f3    in a formula, display the insert function dialog box.
ctrl+a    when the insertion point is to the right of a function name in a formula, display the function arguments dialog box.
ctrl+shift+a    when the insertion point is to the right of a function name in a formula, insert the argument names and parentheses.
f3    paste a defined name into a formula.
alt+= (equal sign)    insert an autosum formula with the sum function.
ctrl+shift+" (quotation mark)    copy the value from the cell above the active cell into the cell or the formula bar.
ctrl+' (apostrophe)    copy a formula from the cell above the active cell into the cell or the formula bar.
ctrl+` (backtick)    alternate between displaying cell values and displaying formulas.
f9    calculate all worksheets in all open workbooks. when a portion of a formula is selected, calculate the selected portion and then press enter or ctrl+shift+enter (for array formulas) to replace the selected portion with the calculated value.
shift+f9    calculate the active worksheet.
ctrl+alt+f9    calculate all worksheets in all open workbooks, regardless of whether they have changed since the last calculation.
ctrl+alt+shift+f9    recheck dependent formulas and then calculate all cells in all open workbooks, including cells not marked as needing to be calculated.
f2    edit the active cell and position the insertion point at the end of the cell contents. if in-cell editing is turned off, move the insertion point to the formula bar.
alt+enter    start a new line in the same cell.
backspace    edit the active cell and then clear it, or delete the preceding character in the active cell as you edit cell contents.
delete    delete the character to the right of the insertion point or delete the selection.
ctrl+delete    delete text to the end of the line.
f7    display the spelling dialog box.
shift+f2    edit a cell comment.
enter    complete a cell entry and select the next cell below.
ctrl+z    undo the last action.
ctrl+shift+z    when the autocorrect smart tag is displayed, undo or redo the last automatic correction.
delete    clear the contents of the selected cells.
ctrl+- (hyphen)    delete the selected cells.
ctrl+shift++ (plus sign)    insert blank cells.
alt+' (apostrophe)    display the style dialog box.
ctrl+1    display the format cells dialog box.
ctrl+shift+~    apply the general number format.
ctrl+shift+$    apply the currency format with two decimal places (negative numbers in parentheses).
ctrl+shift+%    apply the percentage format with no decimal places.
ctrl+shift+^    apply the exponential number format with two decimal places.
ctrl+shift+#    apply the date format with the day, month, and year.
ctrl+shift+@    apply the time format with the hour and minute, and am or pm.
ctrl+shift+!    apply the number format with two decimal places, thousands separator, and minus sign (-) for negative values.
ctrl+b    apply or remove bold formatting.
ctrl+i    apply or remove italic formatting.
ctrl+u    apply or remove underline.
ctrl+5    apply or remove strikethrough.
ctrl+9    hide the selected rows.
ctrl+shift+( (opening parenthesis)    unhide any hidden rows within the selection.
ctrl+0 (zero)    hide the selected columns.
ctrl+shift+) (closing parenthesis)    unhide any hidden columns within the selection.
ctrl+shift+&    apply the outline border to the selected cells.
ctrl+shift+_ (underscore)    remove the outline border from the selected cells.

there are shortcut keys specifically for using the border tab in the format cells dialog. press ctrl+1 to display the format cells dialog. press ctrl+pgdn until you arrive at the border tab. then you can use the shortcut keys shown in table 5.6.

table 5.6 shortcut keys for borders

shortcut key    action
alt+t    apply or remove the top border.
alt+b    apply or remove the bottom border.
alt+l    apply or remove the left border.
alt+r    apply or remove the right border.
alt+h    if cells in multiple rows are selected, apply or remove the horizontal divider.
alt+v    if cells in multiple columns are selected, apply or remove the vertical divider.
alt+d    apply or remove the downward diagonal border.
alt+u    apply or remove the upward diagonal border.

using excel 2003 keyboard accelerators

in legacy versions of excel, most menu items included one underlined letter.
in those versions, you could hold down the alt key while pressing the underlined letter to invoke the menu item. in the excel 2003 screen shown in figure 5.6, you can display the edit menu by pressing alt+e, and you can select edit, fill, justify by pressing alt+e+i+j.

figure 5.6 pressing alt+e+i+j performs edit, fill, justify.

instead of pressing alt+e+i+j all at once, when the edit menu is displayed, you can display the fill flyout menu by pressing i. then you can perform the justify command by pressing j.

if you are a power excel user, you probably have a few of these commands memorized, such as alt+e+i+j for edit, fill, justify; alt+e+s+v for edit, paste special, values; and alt+d+l for data validation. if so, hearing that the menu in excel 2010 is completely gone might make you worry that you have to relearn all the shortcut keys. however, there is good news for the power excel gurus who have favorite alt shortcut keys burned into their minds: most of them continue to work as they did in excel 2003.

if you were an intermediate excel user who regularly used the excel 2003 keyboard accelerators but had to look at the screen to use them, you should start using the new keyboard accelerators discussed at the beginning of this chapter.

invoking an excel 2003 alt shortcut

in excel 2003, the main menus were file, edit, view, insert, format, tools, data, window, and help. the keyboard accelerator commands in excel 2003 were alt+f, alt+e, alt+v, alt+i, alt+o, alt+t, alt+d, alt+w, and alt+h.

if you are moving from excel 2003 to excel 2010, you will have the best success when trying to access commands on the edit, view, insert, format, tools, and data menus. none of the keyboard accelerators associated with window or help work in excel 2010. alt+h takes you to the home tab instead of the few commands on the help menu, and alt+w takes you to the view tab.
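the routing described above can be summarized in a few lines of python. this is only a sketch of the rules as this chapter states them, not a model of how excel is implemented; the function name, the dictionaries, and the returned strings are hypothetical, and the file menu is treated as partly supported because only some of its legacy sequences survive.

```python
# Hypothetical helper: names and return strings are illustrative only.
# It encodes the chapter's rules for where a legacy Alt sequence lands in Excel 2010.

LEGACY_MENUS = {
    "e": "edit", "v": "view", "i": "insert",
    "o": "format", "t": "tools", "d": "data",
}
REDIRECTED_TO_TAB = {"w": "view", "h": "home"}


def classify_legacy_accelerator(sequence: str) -> str:
    """Classify a sequence written like 'alt+e+i+j' by its first letter."""
    parts = sequence.lower().split("+")
    if len(parts) < 2 or parts[0] != "alt" or not parts[1]:
        raise ValueError("expected a sequence such as 'alt+e+i+j'")
    first = parts[1]
    if first in LEGACY_MENUS:
        return f"legacy {LEGACY_MENUS[first]} menu: most commands still work"
    if first in REDIRECTED_TO_TAB:
        return f"opens the {REDIRECTED_TO_TAB[first]} tab instead"
    if first == "f":
        return "file menu: only some sequences still work"
    raise ValueError(f"'{first}' was not a top-level excel 2003 menu letter")


if __name__ == "__main__":
    for keys in ("alt+e+i+j", "alt+w+f", "alt+h+a", "alt+f+o"):
        print(keys, "->", classify_legacy_accelerator(keys))
```

the first letter alone decides the outcome here, which mirrors the text: whether an individual command within a supported menu still works has to be looked up command by command.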
some of the keyboard shortcuts associated with the file menu in excel 2003 continue to work in excel 2010. pressing alt+f opens the file menu. in excel 2003, pressing alt+f+o performs file, open. it happens that o is also the shortcut on the excel 2010 file menu for open, so pressing alt+f+o in excel 2010 also performs file, open.

for the shortcut keys alt+e, alt+v, alt+i, alt+o, alt+t, and alt+d, excel switches into office 2003 access key mode. in this mode, a tooltip appears over the ribbon, indicating which letters you have typed so far (see figure 5.7). when you have entered enough letters, the command is invoked. if you have forgotten the sequence, you can press esc to exit the excel 2003 access key mode.

tip
in excel 2007, you had to pause briefly after typing the first letter in the legacy shortcut key sequence. for example, you pressed alt+e, paused, and then pressed s, v to perform edit, paste special, values. if you did not pause, the second letter was lost because excel displayed the pop-up office key sequence window. this problem has been solved in excel 2010. you no longer have to pause between the first and second letters.

figure 5.7 the office 2003 access key tooltip shows which keys you have used so far while entering a legacy shortcut.

determining which commands work in legacy mode

if you try a command that no longer works in excel 2010, nothing happens. several commands don't make sense in the framework of excel 2010, so they have been deprecated. table 5.7 lists the legacy keyboard commands and shows which of them continue to work in excel 2010.

table 5.7 excel legacy keyboard commands
shortcut  works in excel 2010?  command
alt+f+n  yes  file, new
alt+f+o  yes  file, open
alt+f+c  yes  file, close
alt+f+s  yes  file, save
alt+f+a  yes  file, save as
alt+f+g  no  file, save as web page
alt+f+w  no  file, save workspace
alt+f+h  no  file, file search
alt+f+m  no  file, permission
alt+f+e  no  file, check out
alt+f+e  no  file, check in
alt+f+r  no  file, version history
alt+f+b  no  file, web page preview
alt+f+u  no  file, page setup
alt+f+t+s  no  file, print area, set print area
alt+f+t+c  no  file, print area, clear print area
alt+f+v  no  file, print preview
alt+f+p  yes, as alt+f+p+p  file, print
alt+f+d+m  no  file, send to, mail recipient
alt+f+d+s  no  file, send to, original sender
alt+f+d+c  no  file, send to, mail recipient (for review)
alt+f+d+a  no  file, send to, mail recipient (as attachment)
alt+f+d+r  no  file, send to, routing recipient
alt+f+d+e  no  file, send to, exchange folder
alt+f+d+o  no  file, send to, online meeting participant
alt+f+d+x  no  file, send to, recipient using internet fax service
alt+f+i  no  file, properties
alt+f+1  yes  file, 1
alt+f+2  yes  file, 2
alt+f+3  yes  file, 3
alt+f+4  yes  file, 4
alt+f+5  yes  file, 5
alt+f+6  yes  file, 6
alt+f+7  yes  file, 7
alt+f+8  yes  file, 8
alt+f+9  yes  file, 9
alt+f+t  no  file, sign out
alt+f+x  yes  file, exit
alt+e+u  yes  edit, undo
alt+e+r  yes  edit, repeat
alt+e+t  yes  edit, cut
alt+e+c  yes  edit, copy
alt+e+b  yes  edit, office clipboard
alt+e+p  yes  edit, paste
alt+e+s  yes  edit, paste special
alt+e+h  no  edit, paste as hyperlink
alt+e+i+d  yes  edit, fill, down
alt+e+i+r  yes  edit, fill, right
alt+e+i+u  yes  edit, fill, up
alt+e+i+l  yes  edit, fill, left
alt+e+i+a  yes  edit, fill, across worksheets
alt+e+i+s  yes  edit, fill, series
alt+e+i+j  yes  edit, fill, justify
alt+e+a+a  yes  edit, clear, all
alt+e+a+f  yes  edit, clear, formats
alt+e+a+c  yes  edit, clear, contents
alt+e+a+m  yes  edit, clear, comments
alt+e+d  yes  edit, delete
alt+e+l  yes  edit, delete sheet
alt+e+m  yes  edit, move or copy sheet
alt+e+f  yes  edit, find
alt+e+e  yes  edit, replace
alt+e+g  yes  edit, go to
alt+e+k  yes  edit, links
alt+e+o  no  edit, object
alt+e+o+v  no  edit, object, convert
alt+v+n  yes  view, normal
alt+v+p  yes  view, page break preview
alt+v+k  no  view, task pane
alt+v+t+c  no  view, toolbars, customize
alt+v+f  yes  view, formula bar
alt+v+s  no  view, status bar
alt+v+h  yes  view, header and footer
alt+v+c  yes  view, comments
alt+v+v  yes  view, custom views
alt+v+u  yes  view, full screen (caution: use the maximize button to return.)
alt+v+z  yes  view, zoom
alt+i+e  yes  insert, cells
alt+i+r  yes  insert, rows
alt+i+c  yes  insert, columns
alt+i+w  yes  insert, worksheet
alt+i+h  yes  insert, chart
alt+i+s  yes  insert, symbol
alt+i+b  yes  insert, page break
alt+i+a  yes  insert, reset all page breaks
alt+i+f  yes  insert, function
alt+i+n+d  yes  insert, name, define
alt+i+n+p  yes  insert, name, paste
alt+i+n+c  yes  insert, name, create
alt+i+n+a  yes  insert, name, apply
alt+i+n+l  yes  insert, name, label
alt+i+m  yes  insert, comment
alt+i+a  yes  insert, ink annotations
alt+i+p+c  yes  insert, picture, clip art
alt+i+p+f  yes  insert, picture, from file
alt+i+p+s  yes  insert, picture, from scanner or camera
alt+i+p+d  yes  insert, picture, ink drawing and writing
alt+i+p+a  no  insert, picture, autoshapes
alt+i+p+w  no  insert, picture, wordart
alt+i+p+o  no  insert, picture, organization chart
alt+i+g  no  insert, diagram
alt+i+o  yes  insert, object
alt+i+i  yes  insert, hyperlink
alt+o+e  yes  format, cells
alt+o+r+e  yes  format, row, height
alt+o+r+a  yes  format, row, autofit
alt+o+r+h  yes  format, row, hide
alt+o+r+u  yes  format, row, unhide
alt+o+c+w  yes  format, column, width
alt+o+c+a  yes  format, column, autofit selection
alt+o+c+h  yes  format, column, hide
alt+o+c+u  yes  format, column, unhide
alt+o+c+s  yes  format, column, standard width
alt+o+h+r  yes  format, sheet, rename
alt+o+h+h  yes  format, sheet, hide
alt+o+h+u  yes  format, sheet, unhide
alt+o+h+b  yes  format, sheet, background
alt+o+h+t  yes  format, sheet, tab color
alt+o+a  no  format, autoformat
alt+o+d  yes  format, conditional formatting
alt+o+s  yes  format, style
alt+t+s  yes  tools, spelling
alt+t+r  yes  tools, research
alt+t+k  yes  tools, error checking
alt+t+h+h  no  tools, speech, speech recognition
alt+t+h+t  no  tools, speech, show text to speech toolbar
alt+t+d  yes  tools, shared workspace
alt+t+b  yes  tools, share workbook
alt+t+t+h  yes  tools, track changes, highlight changes
alt+t+t+a  yes  tools, track changes, accept or reject changes
alt+t+w  yes  tools, compare and merge workbooks
alt+t+p+p  yes  tools, protection, protect sheet
alt+t+p+a  yes  tools, protection, allow users to edit ranges
alt+t+p+w  yes  tools, protection, protect workbook
alt+t+p+s  yes  tools, protection, protect and share workbook
alt+t+n+m  yes  tools, online collaboration, meet now
alt+t+n+s  yes  tools, online collaboration, schedule meeting
alt+t+n+w  yes  tools, online collaboration, web discussions
alt+t+n+n  yes  tools, online collaboration, end review
alt+t+g  yes  tools, goal seek
alt+t+e  yes  tools, scenarios
alt+t+u+t  yes  tools, formula auditing, trace precedents
alt+t+u+d  yes  tools, formula auditing, trace dependents
alt+t+u+e  yes  tools, formula auditing, trace error
alt+t+u+a  yes  tools, formula auditing, remove all arrows
alt+t+u+f  yes  tools, formula auditing, evaluate formula
alt+t+u+w  yes  tools, formula auditing, show watch window
alt+t+u+m  yes  tools, formula auditing, formula auditing mode
alt+t+u+s  no  tools, formula auditing, show formula auditing toolbar
alt+t+v  yes  tools, solver
alt+t+m+m  yes  tools, macro, macros
alt+t+m+r  yes  tools, macro, record new macro
alt+t+m+s  yes  tools, macro, security
alt+t+m+v  yes  tools, macro, visual basic editor
alt+t+m+e  no  tools, macro, microsoft script editor
alt+t+i  yes  tools, add-ins
alt+t+c  no  tools, com add-ins
alt+t+a  yes  tools, autocorrect options
alt+t+c  no  tools, customize
alt+t+o  no  tools, options
alt+t+d  no  tools, data analysis
alt+d+s  yes  data, sort
alt+d+f+f  yes  data, filter, autofilter
alt+d+f+s  yes  data, filter, show all
alt+d+f+a  yes  data, filter, advanced filter
alt+d+o  yes  data, form
alt+d+b  yes  data, subtotals
alt+d+l  yes  data, validation
alt+d+t  yes  data, table
alt+d+e  yes  data, text to columns
alt+d+n  yes  data, consolidate
alt+d+g+h  yes  data, group and outline, hide detail
alt+d+g+s  yes  data, group and outline, show detail
alt+d+g+g  yes  data, group and outline, group
alt+d+g+u  yes  data, group and outline, ungroup
alt+d+g+a  yes  data, group and outline, auto outline
alt+d+g+c  yes  data, group and outline, clear outline
alt+d+g+e  yes  data, group and outline, settings
alt+d+p  yes  data, pivottable and pivotchart report
alt+d+d+d  yes  data, import external data, import data
alt+d+d+w  yes  data, import external data, new web query
alt+d+d+n  yes  data, import external data, new database query
alt+d+d+e  yes  data, import external data, list
alt+d+i+d  no  data, list, discard changes and refresh
alt+d+i+b  no  data, list, hide border of inactive lists
alt+d+x+i  yes  data, xml, import
alt+d+x+e  yes  data, xml, export
alt+d+x+r  yes  data, xml, refresh xml data
alt+d+x+x  yes  data, xml, xml source
alt+d+x+p  yes  data, xml, xml map properties
alt+d+x+q  yes  data, xml, edit query
alt+d+x+a  yes  data, xml, xml expansion packs edit query
alt+d+d+a  yes  data, import external data, data range properties
alt+d+d+m  yes  data, import external data, parameters
alt+d+i+c  yes  data, list, create list
alt+d+i+r  yes  data, list, resize list
alt+d+i+t  yes  data, list, total row
alt+d+i+v  yes  data, list, convert to range
alt+d+i+p  yes  data, list, publish list
alt+d+i+l  no  data, list, view list on server
alt+d+i+u  no  data, list, unlink list
alt+d+i+y  no  data, list, synchronize
alt+d+r  yes  data, refresh data
alt+w+n  no  window, new window
alt+w+a  no  window, arrange
alt+w+b  no  window, compare side by side with filename
alt+w+h  no  window, hide
alt+w+u  no  window, unhide
alt+w+s  no  window, split
alt+w+f  no  window, freeze panes
alt+w+1  no  window, 1
alt+w+2  no  window, 2
alt+w+3  no  window, 3
alt+w+4  no  window, 4
alt+w+5  no  window, 5
alt+w+6  no  window, 6
alt+w+7  no  window, 7
alt+w+8  no  window, 8
alt+w+9  no  window, 9
alt+w+m  no  window, more windows
alt+h+h  no  help, microsoft excel help
alt+h+o  no  help, show the office assistant
alt+h+m  no  help, microsoft office online
alt+h+c  no  help, contact us
alt+h+l  no  help, lotus 1-2-3 help
alt+h+k  no  help, check for updates
alt+h+r  no  help, detect and repair
alt+h+v  no  help, activate product
alt+h+f  no  help, customer feedback options
alt+h+a  no  help, about microsoft office excel

some people liked using alt+f+t+s in excel 2003 for file, print area, set print area. if you are one of those people, you will be unhappy to hear that your favorite shortcut key is not supported in excel 2010.
however, most of the powerful and common shortcut keys are still available, so there is a good chance that your knowledge of past shortcut keys will help when you upgrade to excel 2010.

6 the excel options dialog

in legacy versions of excel, many options were controlled through either the tools, options or the tools, customize dialog box. tools, customize was the place to turn off the adaptive menus and customize the toolbars. tools, options led to the busiest dialog box in excel; the options dialog had 173 settings on 14 tabs. in addition, the options dialog had six buttons that took you to further dialogs. it was a challenge to find something specific in the old options dialog.

excel 2010 offers a redesigned excel options dialog. microsoft had the following goals for the options dialogs for all the office products:
• show the most important settings earlier and more clearly, so they can be found. most of the items you need to change are now in the general, formulas, proofing, and save categories.
• move the arcane functions to an advanced category.
• add tooltip icons next to many items, so you can understand exactly what those settings do.
• make it clear whether a setting affects all workbooks, the current workbook, or the current worksheet.

oddly, some of the excel options that were located on the resources tab in excel 2007 have been moved to the backstage view under the excel category.

introducing the excel options dialog

this section shows you how to display excel options and provides an overview of what you might find on each tab. later sections of this chapter cover each tab in detail.

the entry to the excel options dialog is at the bottom left of the excel backstage view. open the file menu and find the options button just above the exit button, as shown in figure 6.1.

instead of tabs across the top, the excel options dialog uses ten categories of options along the left side, as shown in figure 6.2.
when you choose a category on the left, the settings for that category appear on the right. table 6.1 shows the general types of settings in each category.

table 6.1 excel options dialog box settings
category  type of settings
general  the most commonly used settings, such as user interface settings, color schemes, default font for new workbooks, number of sheets in a new workbook, and customer name.
formulas  all options for controlling calculation, error-checking rules, and formula settings. note that options for multithreaded calculations are currently considered obscure enough to be on the advanced tab rather than on the formulas tab.
proofing  spell check options and a link to the autocorrect dialog.
save  the default method for saving, autorecovery settings, legacy colors, and web server options.
language  choose editing language, tooltip language, and help language.
advanced  all options that microsoft considers arcane, including excel 2003's editing, display, general, and lotus compatibility settings.
customize ribbon  icons to customize the ribbon. see chapter 4, "customizing the ribbon," for details on using this panel.
quick access toolbar  icons to customize the quick access toolbar. see chapter 2, "the ribbon interface and quick access toolbar," for details on using this panel.
add-ins  a list of available and installed add-ins and smart tags. new add-ins can be installed from the button at the bottom of this category.
trust center  links to the microsoft trust center, with 11 additional categories.

figure 6.1 the excel options button is on the backstage view. (callouts: click for excel options here; click for resources here; items formerly in options, resources)

excel 2007 offered a resources category in excel options. in excel 2010, the items in this category have been moved to the backstage view. refer to figure 6.1.
figure 6.2 the excel options dialog has 10 categories along the left side.

getting help with a setting

many settings appear with a small i icon. if you hover the mouse near this icon, excel displays a super tooltip for the setting. the tooltip explains what happens when you choose the setting. it also provides some tips about what you need to be aware of when you turn on the setting. for example, the tooltip in figure 6.3 shows information about the calculation settings. it also explains that you should use the f9 key to invoke a manual calculation.

figure 6.3 (callout: help icon)

exploring new excel 2010 options

excel 2010 offers 28 new settings that can be grouped into categories for internationalization, performance, security, and general improvements.

some changes you might notice in excel 2010:
• the excel 2007 category of popular has been renamed to general.
• edit custom lists has been removed from the general category. it is now buried in the display section of the advanced tab.
• show developer tab in the ribbon has been removed from the general category. you can now choose to show or hide the developer tab the same as all the other ribbon tabs in the new customize ribbon category.
• the customize ribbon tab allows you to rearrange commands on the ribbon. see chapter 4 for details on customizing the ribbon.

using autorecover options

for many versions, excel has periodically saved a version of your work every 10 minutes. if your computer crashes, the recovery pane offers to let you open the last autorecovered version of the file. this feature is sure to save you from retyping data that might otherwise have been lost.

another painful situation occurs when you do not save changes and then close excel. yes, excel will ask if you want to save changes for each open document, but this question usually pops up at 5:00 p.m. when you are in a hurry to get out of the office.
if you are thinking about what you need to do after work and not paying attention to which files are still open, you might click no to the first document and then click no again and again without noticing that the fifth open document was one that should have been saved.

another scenario is leaving an excel file open overnight only to discover that windows update decided to restart the computer at 3:00 a.m. after being burned by this a dozen times, you can change the behavior of windows update to stop doing this. however, if windows update closed excel without saving your documents, you can lose those autorecovered documents.

a new setting in excel 2010 has excel save the last autorecovered version of each open file when you close without saving. this setting is on the save category of excel options and is called keep the last autosaved version if i close without saving. as soon as you realize that you closed without saving, visit the autorecover file location (usually %appdata%\microsoft\excel\) to see if your file is there. copy it, rename it, and paste it back to a safe location.

another new option in the save category is the location for storing files that you check out from a sharepoint server. in the save checked out files to section, you can specify that files should be saved in an office document cache, as well as the location for the cache.

new excel 2010 options for internationalization

nine new options impact non-english versions of excel 2010:
• brazilian, portuguese, and spanish proofing options have been added to the proofing tab. you can choose pre-reform and post-reform proofing for portuguese and a verb form for spanish.
• separate language choices for help, tooltips, and display have been added to the language tab. you can now edit in one language but have the tooltips appear in another language.
this setting is great for translators and others who must write for the home office in one language but prefer to have help and tooltips appear in their primary language.
• right-to-left language support has been added in three places to better support languages such as arabic and hebrew that are read from right to left. the display section of the advanced tab offers a choice to set the default direction for all workbooks. you can also override that for a particular worksheet in the display worksheet section of the advanced tab. finally, the editing section of the advanced tab offers a new cursor movement setting to control the cursor in right-to-left environments.
• support for the iso 8601 date format appears in the save tab. do not use this setting unless you have a specific need and everyone in your workgroup has excel 2010. if you save in this format and share the workbook with excel 2007, your dates will be lost.

new excel 2010 options for performance

many excel features offer a lot of flash and glitter, which is fine if you have a modern computer with multiple gigabytes of ram and a fast display processor. on an older machine, though, the animation that shows columns sliding over after you insert a new column might severely hamper performance.

the following performance settings are new in excel 2010:
• insert charts using draft mode has been added to the chart section of the advanced category. excel 2010 lifts the limit on the number of points that can be shown in a single series. if you show 12 monthly points in an accounting chart, you do not need this setting. if you are a scientist showing a million points in a series, this is a great setting until you are ready to print the final workbook. if you choose this setting, the chart is rendered in a draft mode. a drop-down indicator appears in the corner of the chart with options to exit draft mode or to hide the indicator on the chart (figure 6.4).
note
this setting affects all charts that you create. you can use the draft mode drop-down on the chart tools design tab to change draft mode for one or all charts in the current workbook.

figure 6.4 a quarter-million point chart in draft mode renders quickly.

• hide draft notification on charts has been added to the chart section of the advanced category. this globally removes the draft mode drop-down from charts rendered in draft mode.
• disable hardware graphics acceleration has been added to the display section of the advanced category. although hardware graphics acceleration is designed to make things faster, this is not true for older pcs. the hardware cards in older pentium iii era machines actually slow things down. if you have an older pc, try turning off the hardware graphics acceleration in excel to see if the screen renders faster.
• disable undo for large pivottable refresh operations has been added to the general section of the advanced category. this setting is designed to make the refresh time faster for large pivot table data sets. note that by default, undo is turned off automatically when you have more than 300,000 records in the pivot cache.
• a new image size & quality section has been added to the advanced category. most people add a photo to dress up the cover page of a document. however, you probably don't need an 8 megapixel image living in the workbook. by default, excel compresses the image before saving the file. you can control the target output size using the drop-down in excel options. choices include 96 ppi, 150 ppi, and 220 ppi. the 96 ppi setting will look fine on your display. use 220 ppi for images that you will print. if you want to keep your images at the original size, you can select the new do not compress images in file setting.
• discard editing data has been added to the image size & quality section of the advanced category.
this has both file size and privacy considerations. the left side of figure 6.5 shows an old picture of my son. after cropping the photo, using 96 ppi rendering and discard editing data, the file size was reduced from 445k to 18k. that is a significant file size reduction. more important, it also ensures that only the cropped area of the photo is saved with the document. if you do not have editing data discarded, the next person to receive the spreadsheet can select the image and click the crop icon in the picture tools, format ribbon tab to see the parts of the photograph that you cropped out.

figure 6.5 if you do not discard editing data, someone can later see what you cropped out of the photo.

• enable multi-threaded calculation is new in the general section of the advanced category. if excel is your primary application, leave this in the default state. if excel is only a hobby and your computer should be using the other three processors in your quad-core cpu for running another application, you can disable multithreaded calculation.
• allow user-defined xll functions to run on a compute cluster is new in the formulas category. some companies have invested in extending excel by designing new functions in vb.net, c#, or c. the theory is that if you needed a new function, you would have knocked it out in vba. this means that people are forced to use xll add-ins when they need better performance than vba. if you have xll add-ins, you can use this new setting to send xll calculations out to a high performance computing server. these computers have anywhere from 16 to 4,000 processors! a microsoft whitepaper at http://download.microsoft.com/download/f/b/5/fb5a8dee-f313-4a24-ac35-383d71ff5992/hpcserverwhitepaper.pdf describes an insurance risk model where the recalc time is 200 hours on a single processor.
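the image size settings in the list above come down to simple arithmetic, sketched here in python. the sketch assumes that compression resamples a picture to roughly its displayed size in inches multiplied by the target ppi; that is an assumption for illustration, not a description of excel's actual algorithm. the function names and the 4-by-3-inch picture are hypothetical, while the 445k and 18k figures are the ones quoted for the cropped photo.

```python
# Illustrative arithmetic only; function names and the sample picture size are assumptions.

def pixels_at_ppi(width_in: float, height_in: float, ppi: int) -> tuple[int, int]:
    """Approximate pixel dimensions for a picture displayed at the given size in inches."""
    return round(width_in * ppi), round(height_in * ppi)


def percent_reduction(before_kb: float, after_kb: float) -> float:
    """Percentage by which the file shrank, rounded to one decimal place."""
    return round(100.0 * (before_kb - after_kb) / before_kb, 1)


if __name__ == "__main__":
    # A hypothetical 4 x 3 inch picture at each target output setting.
    for ppi in (96, 150, 220):
        print(ppi, "ppi ->", pixels_at_ppi(4.0, 3.0, ppi))
    # File sizes quoted in the text for the cropped photo.
    print("reduction:", percent_reduction(445, 18), "%")
```

under that assumption, the gap between 96 ppi and 220 ppi is more than a factor of five in pixel count, which is why the lower setting suits on-screen use and the higher one suits printing.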
additional excel features affecting performance

the following options, which were added in excel 2007, also affect performance:
• show mini toolbar on selection is in the general category. this feature is popular in word, but it rarely appears in excel. to use it, you have to select a few characters from a cell while the cell is in edit mode. the mini toolbar provides quick access to text formatting tools.
• enable live preview is also in the general category. when you hover over a command, the worksheet previews the change before you click to select the setting.
• alert the user when a potentially time-consuming operation occurs is in the advanced category. when this option is set, excel alerts you when a potentially time-consuming operation occurs. by default, excel warns you when an operation will affect more than 33,554 thousand cells (the threshold is entered in thousands of cells). you can change the cell threshold or simply turn this feature off. if you are about to do a subtotals command, you need to invoke the command whether or not it will take a long time.

new excel 2010 options for security

excel 2010 offers seven new settings for improved security. most of these are located in the trust center. to access the trust center, open excel options, click the trust center category, and then click the trust center settings button.

working with trusted document settings

legacy versions of excel used the concept of a trusted folder. this means that anything stored in a particular folder is not subject to macro security and external content settings.

excel 2010 has a somewhat uncharacteristic easing of these rules. when you open a file from your local hard drive that has external links or macros, excel displays a security warning in the info bar, as shown in figure 6.6.
tip
keep in mind that if you choose to enable content, excel will remember that this file is a trusted document, which means you will not be required to click enable content the next time you open the file.

the inherent problem here is that if you open a file and discover the macros are bad, you will not want those macros to run automatically the next time. there is no way to untrust a single document other than deleting, renaming, or moving it. instead, you have to go to the trusted documents category of the trust center, where you can choose to reset the list of trusted documents (see figure 6.7).

figure 6.6 excel alerts you when you open a file with external links or a macro.

figure 6.7 use the trusted document category to reset a list of trusted documents.

two additional settings in this category modify trusted document behavior. allow documents on a network to be trusted controls whether the trusted document concept extends to files stored on a network. you might choose to trust documents on your local hard drive but not trust documents on a network. alternatively, you might decide that your coworkers are all competent enough never to download malicious macros. in this case, the trusted document paradigm can extend to network drives.

finally, you can choose to disable all trusted documents, which forces you to answer the enable question for every document that has macros or external links.

microsoft has been uptight about macro security for a long time. for this reason, it was initially perplexing why microsoft would choose to turn off external links in excel 2007. however, after years of clicking enable repeatedly, it almost seems like the lovable behavior of a paranoid uncle.
most of us will actually miss the repeated requests asking whether it is okay to allow external links to work.

controlling other security options

excel 2010 offers the following new security options:
• enable data execution prevention mode is in the new dep settings category of the trust center. everything on a computer is stored as a series of 1s and 0s. for example, program code is a bunch of 1s and 0s, and spreadsheet data is stored as a series of 1s and 0s.
• antivirus programs are vigilant in examining program code for viruses but may not search your spreadsheet data looking for malicious code. for this reason, hackers might use an approach where they embed the bad code in a worksheet data set and then have a virus attempt to run code that is stored in the data. this is pretty advanced stuff; you would have to be a 13-year-old computer geek to understand it.
• windows xp sp2 introduced a set of technologies to prevent malicious code from exploiting this vulnerability. most of the time, it makes sense to prevent anything stored in the data area of memory from being executed as program code. some programs legitimately need to run code from the data area. leave this setting turned on unless you have a specific program that has a dep compatibility problem.
• show customer submitted office live content is a new setting in the general section of the advanced tab. other people can share their general-purpose templates on office live. clear this check box to prevent those documents from being available to you.
• choose to block or open certain file types in protected view is in the file block settings of the trust center. excel 2010 can open many types of files, going back to excel 2 macrosheets. the excel 2010 file formats are more secure than these old file types.
however, if you want to prevent the older files from opening, or perhaps you want to open them in protected view, where you can see the data but macros are not allowed to run, you can adjust these settings in the new file block settings (see figure 6.8).

figure 6.8 you can choose to open certain file types in protected view to prevent malicious code from running.

• automatically detect installed office applications to improve office online search results is in the privacy section of the trust center. many help topics come from office online. these topics might refer you to help topics about outlook or powerpoint. with this new setting, your computer tunes the office online results to not offer topics about outlook if you do not have outlook installed. this seems like a good thing, but if you do not want microsoft to know which applications you have installed, you can prevent this information from being sent.
• allow research task pane to check for and install new services is new in the privacy section of the trust center. the research pane can be displayed using review, research. it offers stock quotes, a dictionary, and a thesaurus. if you select this check box, microsoft makes additional research tools available to you as they are released.

ten options to consider

although hundreds of excel options exist, this section provides a quick review of some options that might be helpful to you.
• update your name in the general category. the name stored on this tab is used in cell comments and in the document properties.
• save file in this format in the save category. if you regularly create macros, choose the excel macro-enabled workbook as the default format type.
• update your default file location in the save tab. excel always wants to save new documents in your my documents folder. however, if you always work in the c:\accountingfiles\ folder, update the default folder to match your preferred location.
• After Pressing Enter, Move Selection Direction is the first setting in the Advanced category. If you regularly perform data entry and prefer that the cell pointer move across the spreadsheet, change this setting from Down to Right.
• Show This Number of Recent Documents has been enhanced dramatically since Excel 2003. Whereas legacy versions of Excel showed up to nine recent documents at the bottom of the File menu, Excel 2010 can show up to 50 recent documents in the Recent category of the File menu. To raise the number, visit the Display section of the Advanced category.
• Edit Custom Lists has been moved to the General section of the Advanced category. Custom lists add functionality to the fill handle, allow custom sort orders, and control how fields are displayed in the label area of a pivot table. Type a list in the correct sequence in a worksheet, select the list, click Edit Custom Lists, and then click Import. Excel can now automatically extend items from that list, the same as it can extend January into February, March, and so on.
• Make Excel look less like Excel by hiding interface elements in the three Display sections of the Advanced category. You can turn off the formula bar, scrollbars, sheet tabs, row and column headers, and gridlines. You can customize the Ribbon to remove all main tabs except the File menu. The point is that if you design a model to be used by someone who never uses Excel, the person can open the model, plug in a few numbers, and get the result without having to see the entire Excel interface. Figure 6.9 shows an Excel screen with all elements removed.
• Show a Zero in Cells That Have Zero Value is in the Display Options for This Worksheet section of the Advanced category. Occasionally people want zeros to be displayed as blanks. Although a custom number format of 0;-0;; will do this, you can change the setting for the whole worksheet by clearing this option.
• Group Dates in the AutoFilter Menu is in the Display Options for This Workbook section of the Advanced category. Starting with Excel 2007, date columns show a hierarchical view of years, months, and days in the AutoFilter drop-down. If you like the old behavior of showing each individual date, turn off this setting.
• Add a folder on your local hard drive as a trusted location. Files stored in a trusted location automatically have macros enabled and external links updated. If you can trust that you will not write malicious code, define a folder on your hard drive as a trusted location. From Excel Options, select the Trust Center category and then Trust Center Settings. In the Trust Center, select Trusted Locations, Add New Location.

Five Excel Oddities

You may rarely need any of the features presented in this section. However, in the right circumstance, they can be timesavers.

• Adjust the gridline color in the Display Options for This Worksheet section of the Advanced category. If you are tired of gray gridlines, you can get a new outlook with bright red gridlines.

Figure 6.9 Hide elements of the Excel interface to create a clean look for your model.

• Allow negative time by switching to the 1904 date system in the When Calculating This Workbook section of the Advanced category. Excel normally does not allow a formula to return a negative time. However, if you are tracking comp time and you allow people to borrow against future comp time, it might be nice to allow negative time. In this case, switch to the 1904 date system to have up to 3 years of negative time.
• Put an end to the green triangles on your account numbers stored as text. Most of the green triangle indicators are useful. However, if you have a column of text account numbers where most values are numbers, seeing thousands of green triangles can be annoying. In addition, the green triangles can hide other, more serious, problems.
Clear the Numbers Formatted as Text or Preceded by an Apostrophe check box in the Error Checking Rules section of the Formulas category.
• Automatically Insert a Decimal Point replicates the antique adding machines that were office fixtures in the 1970s. When working with a manual adding machine, it was frustrating to type decimal points. You could type 123456 and the adding machine would interpret the entry as 1,234.56. If you find that you are doing massive data entry of numbers in dollars and cents, you can have Excel replicate the old adding machine functionality. After enabling this setting, you can indicate how many digits of the number should be interpreted as falling after the decimal point. The only hassle is that you need to enter 5 as “500”. The old adding machines actually had a 00 key, but those are long since gone.
• Change Dwight to Diapers using AutoCorrect Options. If you are a fan of the NBC sitcom “The Office”, you might remember the 2007 episode in which Jim allegedly put a macro on Dwight’s computer that automatically changed the typed word “Dwight” to “Diapers.” However, this doesn’t require a macro. From Excel Options, choose the Proofing category, and then click the AutoCorrect Options button. On the AutoCorrect tab, you can type new correction pairs. In this example, you would type Dwight into the Replace box and Diapers into the With box. The next time someone types Dwight and then a space, the word will automatically change to Diapers. You can also remove correction pairs by selecting the pairs and then pressing Delete. For example, if you hate that Microsoft converts (c) to ©, you can delete that entry from the list.

To see a video demo of this trick, search for Excel In Depth 6 at YouTube.

Guide to Excel Options

Table 6.2 shows every Excel feature that you can change in the Excel Options dialog. The table shows the feature, the category where it can be modified, the section within the category, and the text of the option.
note note that unlike excel 2003, the status bar can no longer be removed in the excel 2010 interface.the excel opt ions dialog 100 1 part table 6.2 excel options by feature feature category section text mini toolbar general user interface options show mini toolbar on selection live preview general user interface options enable live preview skin general user interface options color scheme screentips general user interface options screentip style font general new workbooks use this font font size general new workbooks font size view general new workbooks default view for new sheets worksheets general new workbooks include this many sheets name general personalize user name calculation formulas calculation options workbook calculation automatic calculation formulas calculation options workbook calculation automatic except for data tables calculation formulas calculation options calculate workbook manual calculation formulas calculation options recalculate workbook before saving circular reference formulas calculation options enable iterative calculation formulas circular reference formulas calculation options maximum iterations formulas circular reference formulas calculation options maximum change formulas hpc cluster formulas calculation options allow xll functions to run on a compute cluster r1c1 style formulas working with formulas r1c1 reference style autocomplete formulas working with formulas formula autocomplete formula style formulas working with formulas use table names in formulas getpivotdata formulas working with formulas use getpivotdata functions for pivottable references101 guide to excel opt ions 6 chapter feature category section text error checking formulas error checking enable background error checking error indicators formulas error checking indicate errors using this color errors ignored formulas error checking reset ignored errors errors in formulas formulas error checking rules error checking rules—cells containing formulas that result in an 
error formulas inconsistent in table formulas error checking rules error checking rules— inconsistent calculated column formulas in tables dates as text formulas error checking rules error checking rules—cells containing years represented as two digits numbers as text formulas error checking rules error checking rules— numbers formatted as text or preceded by an apostrophe formulas—rules inconsistent formulas error checking rules error checking rules— formulas inconsistent with other formulas in the region formulas—omitted cells formulas error checking rules error checking rules— formulas which omit cells in a region protected sheet with unlocked formulas formulas error checking rules error checking rules— unlocked cells containing formulas empty cells in formula formulas error checking rules error checking rules— formulas referring to empty cells table with invalid data formulas error checking rules error checking rules—data entered in a table is invalid autocorrect proofing autocorrect options autocorrect options spelling proofing when correcting spelling spelling—ignore words in uppercase spelling proofing when correcting spelling—ignore words that contain numbersthe excel opt ions dialog 102 1 part feature category section text spelling proofing when correcting spelling spelling—ignore internet and file addresses spelling proofing when correcting spelling spelling—flag repeated words spelling proofing when correcting spelling spelling—enforce accented uppercase in french spelling proofing when correcting spelling spelling—suggest from main dictionary only spelling proofing when correcting spelling spelling—custom dictionaries spelling proofing when correcting spelling french modes spelling proofing when correcting spelling spanish modes spelling proofing when correcting spelling portuguese modes spelling proofing when correcting spelling brazilian modes spelling proofing when correcting spelling dictionary language file format save save workbooks save files in 
this format autorecover save save workbooks save autorecover information every n minutes autorecover save save workbooks keep the last auto recovered file if i close without saving autorecover save save workbooks autorecover file location location folder for files save save workbooks default file location iso dates save save workbooks save date and time values using iso 8601 date format (may limit precision) autorecover—disable save autorecover exceptions disable autorecover for this workbook only103 guide to excel opt ions 6 chapter feature category section text server save offline editing options for document management server files save checked-out files to the server drafts location files on this computer server save offline editing options for document management server files save checked-out files to the web server server save offline editing options server drafts location or office document cache for document management server files color save preserve visual appearance choose what colors will be seen in legacy versions of excel languages language preferences choose editing languages help language language preferences choose display and help language screentip language language preferences choose screentip language direction after enter advanced editing after pressing enter, move selection direction decimal places advanced editing automatically insert a decimal point with n places fill handle advanced editing enable fill handle and cell drag and drop override alert advanced editing alert before overwriting cells with drag and drop in-cell editing advanced editing allow editing directly in cell extend ranges advanced editing extend data range formats and formulas percentages advanced editing enable automatic percent entry autocomplete advanced editing enable autocomplete for cell values mouse wheel advanced editing zoom on roll with intellimousethe excel opt ions dialog 104 1 part feature category section text time consuming operation advanced editing alert 
the user when a potentially time consuming operation occurs time consuming operation advanced editing time consuming is when this number of cells (in thousands) is affected numeric separators advanced editing use system separators numeric separators advanced editing decimal separator numeric separators advanced editing thousands separator cursor movement advanced editing cursor movement logical or visual paste advanced cut, copy, and paste show paste options buttons insert advanced cut, copy, and paste show insert options buttons images advanced cut, copy, and paste cut, copy, and sort inserted objects with their parent cells cropped images advanced image size and quality discard editing data image compression advanced image size and quality do not compress images in file image compression advanced image size and quality set default target output to n ppi chart elements advanced chart show chart element names on hover chart data points advanced chart show data point values on hover draft mode advanced chart insert charts using draft mode draft indicator advanced chart hide draft mode notification on charts recent file list advanced display show this number of recent documents ruler advanced display ruler units taskbar advanced display show all windows in the taskbar formula bar advanced display show formula bar105 guide to excel opt ions 6 chapter feature category section text screentips advanced display show function screentips acceleration advanced display disable hardware graphics acceleration comment display advanced display for cells with comments, show no comments or indicators comment display advanced display for cells with comments, show indicators only, and comments on hover comment display advanced display for cells with comments, show comment and indicators right-to-left advanced display default direction custom lists general top options create lists for use in sorts and fill sequences scrollbars advanced display—workbook show horizontal scroll bar 
(affects one workbook only) scrollbars advanced display—workbook show vertical scroll bar (workbook only) tabs advanced display—workbook show sheet tabs (workbook only) dates in autofilter advanced display—workbook group dates in the autofilter menu (workbook only) images, hide advanced display—workbook for objects, show or hide (workbook only) column & row headers advanced display—worksheets show row and column headers (affects one work-sheet only) formulas, show advanced display—worksheets show formulas in cells instead of their calculated results (worksheet) right-to-left (sheet) advanced display—worksheets show sheet right-to-left page breaks advanced display—worksheets show page breaks (worksheet) zero, display advanced display—worksheets show a zero in cells that have zero value (worksheet)the excel opt ions dialog 106 1 part feature category section text outline symbols advanced display—worksheets show outline symbols if an outline is applied (worksheet) gridlines advanced display—worksheets show gridlines (worksheet) gridline color advanced display—worksheets gridline color (worksheet) multi-threaded calculation advanced general enable multi-threaded calculation multi-threaded calculation advanced threads formulas number of calculation threads links, update advanced when calculating workbook update links to other documents precision as displayed advanced when calculating workbook set precision as displayed dates, 1904 system advanced when calculating workbook use 1904 date system external link values advanced when calculating workbook save external link values sound feedback advanced general provide feedback with sound animation feedback advanced general provide feedback with animation dde advanced general ignore other applications that use dde links, update advanced general ask to update automatic links add-in errors advanced general show add-in user interface errors paper size advanced general scale content for a4 or 8.5”11” paper sizes office live 
advanced general show customer submitted office live content start-up folder advanced general at start-up, open all files in web options advanced general web options processing advanced general enable multi-threaded processing pivot undo advanced general disable undo for large pivottable refresh operations (over n thousand records) service options advanced general service options menu access key advanced lotus compatibility microsoft office excel menu key navigation keys advanced lotus compatibility transition navigation keys lotus formula evaluation advanced lotus for worksheet lotus transition formula evaluation lotus formula advanced lotus for worksheet lotus entry transition formula entry ribbon customize ribbon n/a add ribbon customize ribbon n/a remove ribbon customize ribbon n/a new tab ribbon customize ribbon n/a new group ribbon customize ribbon n/a rename ribbon customize ribbon n/a restore defaults ribbon customize ribbon n/a import/export qat quick access toolbar n/a add qat quick access toolbar n/a remove qat quick access toolbar n/a new tab qat quick access toolbar n/a new group qat quick access toolbar n/a rename qat quick access toolbar n/a restore defaults qat quick access toolbar n/a import/export qat location quick access toolbar n/a show qat below ribbon add-ins add-ins n/a manage excel add-ins trust trust center n/a trust center settings

7 The Big Grid and File Formats

Word leaked out in 2004 that Microsoft would be increasing the number of rows in the next version of Excel. Although no one knew exactly how many rows, one thing was certain: the Excel file format would have to change. The old XLS file format was designed around cell addressing that would fit in a 2^16 address space, hence the limit of 65,536 rows.
Excel Grid Limits

The new grid in Excel offers 1,048,576 rows (that is, 2^20 rows), a sixteen-fold increase from 65,536 rows in Excel 2003. It offers 16,384 columns (that is, 2^14 columns), an increase from 256 columns in Excel 2003. Overall, the new grid provides for 17.1 billion cells on each worksheet.

You can now analyze more complex data sets. For example, if you regularly analyze 2,000 items, you can analyze 2.5 years of monthly data in one Excel 2003 worksheet. In Excel 2010, you can analyze 10 years of weekly data or 43 years of monthly data. Columnwise, legacy versions of Excel could handle only 9 months of daily data going across the worksheet. Excel 2010 can handle 45 years of daily dates or 63 years of weekdays.

It is interesting to compare the size increase in the history of spreadsheets. You will see that the size increase is unprecedented. Here is a brief history of spreadsheets:

• In October 1979, VisiCalc debuted with 255 rows and 63 columns.
• In 1983, Lotus 1-2-3 debuted with 8,192 rows and 256 columns. The 2 million cells per worksheet in this version was a 13,000% increase over VisiCalc.
• In 1987, early versions of Excel offered 16,384 rows by 256 columns. This 4 million cells was double the amount offered in Lotus 1-2-3 release 2.2.
• In Excel 97, Microsoft increased Excel to offer 65,536 rows by 256 columns. This 16.7 million cells per spreadsheet was quadruple the previous limit.
• In Excel 2007, the new grid size is 1,048,576 rows by 16,384 columns. This is 17.1 billion cells, which is a 102,300% increase over the previous limit.

Why Are There Only 65,536 Rows in My Excel 2010 Spreadsheet?

When you initially install Excel 2010 and open one of your large workbooks, you might be disappointed to find only 65,536 rows in the worksheet. When this occurs, it means you are in compatibility mode.
Any time that you open an .xls format file, you are limited to the 65,536 rows that were available in legacy versions of Excel. Note that the title bar of Figure 7.1 indicates that you are in compatibility mode. To leave compatibility mode, select File, and then select Convert, as shown in Figure 7.2. Excel tells you that the old file will be deleted and replaced by a new file. After the file conversion is done, Excel offers to close and reopen the file so that you are no longer in compatibility mode.

Figure 7.1 What is all the hype? There are still only 65,536 rows.

Figure 7.2 You select Convert to leave compatibility mode.

With the bigger grid, it is far more likely that you will encounter larger files, formulas, and pivot tables. With a 102,300% increase in the size of the grid, many of the old limits in Excel 2003 no longer make sense. Because of the bigger grid, Microsoft relaxed the limits in many areas. Limits are discussed in the next section.

There is also an unusual quirk with the big grid. Previously, columns were labeled from A to IV. Now, columns are labeled from A to XFD. This means that many three-letter words are now valid column names. In legacy versions of Excel, range names such as ROI2011 or TAX2008 would have been legal names. Now that these are actual cell addresses, those names can no longer be used in Excel 2010. Excel automatically changes such names during the conversion process. For example, a name such as ROI2011 will change to _ROI2011.

In addition, you don’t have to worry about updating most formulas, because any formulas that reference the old name will change. However, if you have any VBA macros that refer to the old name, or any formulas that included the old name in double quotes in the INDIRECT function, you will have to manually fix those.

Other Limits in Excel 2010

In addition to the grid size, a number of other aspects of Excel have new limits. Table 7.1 illustrates these new limits.
Table 7.1 Excel 2010 Limits

Item | Old Limit | New Limit
PC memory Excel can use | 1GB | Max allowed by Windows
Number of unique colors in a single workbook | 56 | 4.3 billion
Number of conditional format conditions on a cell | 3 | Limited by available memory
Number of levels of sorting | 3 | 64
Number of items in the AutoFilter drop-down | 1,000 | 10,000
Number of characters that can display in one cell | 1,024 | 32,768
Number of characters in a cell that Excel can print | 1,024 | 32,768
Number of unique cell styles in a workbook | 4,096 | 65,536
Maximum length of formulas | 1,024 characters | 8,192 characters
Number of levels of nesting in formulas | 7 | 64
Maximum number of arguments in a function | 30 | 255
Number of characters that can be displayed in a cell formatted as text | 255 | 32K
Number of items that can be found with Find All | 65,536 | 2 billion
Number of columns allowed in a pivot table | 255 | 16,384
Number of unique items in a single pivot field | 32,768 | 1 million
Number of fields in a pivot table | 255 | 16,384
Number of cells that can depend on a single area before Excel must do full calculations instead of partial calculations (because it can no longer track the dependencies required to do partial calculations) | 8,102 | Limited by available memory
Number of different areas in a sheet that can have dependencies before Excel must do full calculations instead of partial calculations (because it can no longer track the dependencies required to do partial calculations) | 65,536 | Limited by available memory
Number of array formulas in a worksheet that can refer to another (given) worksheet | 65,536 | Limited by available memory
Number of categories that custom functions can be grouped into | 32 | 255
Number of characters that can be updated in a nonresident external workbook reference | 255 | 32,768
Number of rows of a column or columns that can be referred to in an array formula | 65,335 | Limitation removed (full-column references allowed)

As you can see in Table 7.1, Excel 2010 includes some excellent improvements. It also has some improvements that allow people to build worse spreadsheets. Many people try to rely on nested IF functions when they should instead learn about VLOOKUP. Raising the nesting limit from 7 to 64 levels allows people to put off learning about VLOOKUP for even longer. If you’ve been avoiding VLOOKUP, you can read about it in Chapter 12, “Using Powerful Functions: Logical, Lookup, and Database Functions.”

With legacy versions of Excel, any pivot table that relied on daily dates almost always had to be built with the dates going down the side instead of across the columns. This was annoying, especially if you planned on rolling the dates up to months or quarters that would eventually fit in the 256 columns.

The limit on the number of Excel formats was a problem that was rarely encountered but that caused horrible frustration when it was hit. Now the limit will be hit much less frequently.

Even with these new limits, some areas could still be improved. For example, there is still a limit of eight levels of indentation in outlining. However, for the most part, the new limits are incredible and allow much larger analyses to happen in Excel instead of elsewhere.

Tips for Navigating the Big Grid

The navigation tips described in the following sections are not new to Excel 2010. However, with 17 billion cells, there is a better chance that you don’t want to be scrolling around with the Page Up and Page Down keys.
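The 17 billion figure, and the A-to-XFD column labels described earlier in this chapter, can be checked with a little arithmetic. The following is a minimal Python sketch, not anything built into Excel: the helper names column_number and is_cell_address are hypothetical, and the check assumes a name collides with the grid only when it is letters followed by digits and falls inside the row and column limits.

```python
import re

OLD_ROWS, OLD_COLS = 2**16, 2**8    # legacy grid: 65,536 rows x 256 columns
NEW_ROWS, NEW_COLS = 2**20, 2**14   # big grid: 1,048,576 rows x 16,384 columns


def column_number(letters: str) -> int:
    """Convert column letters (A, Z, AA, XFD) to a 1-based column number."""
    number = 0
    for ch in letters.upper():
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def is_cell_address(name: str, max_rows: int, max_cols: int) -> bool:
    """Return True if name parses as an A1-style address inside the grid."""
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", name)
    if match is None:
        return False
    col = column_number(match.group(1))
    row = int(match.group(2))
    return 1 <= col <= max_cols and 1 <= row <= max_rows


old_cells = OLD_ROWS * OLD_COLS
new_cells = NEW_ROWS * NEW_COLS
increase_pct = (new_cells - old_cells) / old_cells * 100

print(f"{new_cells:,} cells")                          # 17,179,869,184 cells
print(f"{increase_pct:,.0f}% increase")                # 102,300% increase
print(column_number("IV"), column_number("XFD"))       # 256 16384
print(is_cell_address("ROI2011", OLD_ROWS, OLD_COLS))  # False
print(is_cell_address("ROI2011", NEW_ROWS, NEW_COLS))  # True
```

Under these assumptions, ROI2011 is an ordinary name on the legacy grid (column ROI would be far beyond IV) but a real cell address on the big grid, which is why such range names have to be renamed during conversion.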
Using Shortcut Keys to Move Around

A variety of shortcuts enable you to quickly move around a worksheet:

Shortcut | What It Does
Ctrl+Home | Move to cell A1
Home | Move to column A of the current row
Ctrl+End | Move to the last used cell in a worksheet
Ctrl+any arrow key | Jump to the end of a contiguous range
Ctrl+Up Arrow | Move to the first row in the data if your data has no blank cells
Ctrl+Down Arrow | Move to the last row in the data if your data has no blank cells

Using the End Key to Navigate

The End key is one of the six keys above the arrow keys on a standard keyboard. When you press the End key, an indicator lights up in the status bar of the Excel window. When Excel is in End mode, you can press an arrow key or the Home key. Pressing an arrow key takes you to the edge of a contiguous range of cells. Pressing Home while in End mode takes you to the last used cell in the worksheet.

In Figure 7.3, pressing End and then the Down Arrow key causes the cell pointer to jump from C30 to C36. When the cell pointer is on the edge of a range, pressing End followed by the Down Arrow again causes Excel to jump over a range of blank cells and land on the starting edge of the next range. For example, pressing End and then the Down Arrow from C36 causes the cell pointer to jump to C39. Press End and then the Right Arrow from C39 to jump the gap and land in F39.

If you press the End key to move right or down from the last cell that contains data, the cell pointer jumps to the last row or column in the spreadsheet. In a blank worksheet, you can press End and then the Down Arrow, followed by End and then the Right Arrow, to move to XFD1048576.

To see a demo of using the End key to navigate, search for Excel In Depth 7 at YouTube.

Using the Current Range to Navigate

If your data has many blank cells, using Ctrl+arrow keys or the Ctrl+End key will lead to frustration. You can press Ctrl+* to select the current region.
A current region starts from the current nonblank cell and extends out in all directions until Excel encounters a completely blank row, a completely blank column, or the edge of the spreadsheet. Then you can press Ctrl+. (that is, Ctrl plus the period key) to move the active cell to each corner of the selection. From the top-left cell of a region, you can press Ctrl+* and then press Ctrl+. twice to go to the last used cell in the current region.

Using Go To to Navigate

You can press the F5 key to display the Go To dialog. Then you can type a cell address and click OK to quickly jump to that cell.

You can also use the Name box the same way you use the Go To dialog. The Name box is the drop-down area immediately to the left of the formula bar. You click in the Name box, type a valid cell address, and press Enter. Excel then jumps to that cell.

Figure 7.3 Pressing End and then an arrow key causes Excel to jump over a range of blank cells or a contiguous range of cells.

Understanding the New File Formats

Excel 2007 and Excel 2010 offer three new file formats, which are discussed in this section. Later, the section on file compatibility discusses how you can continue to share files with people using legacy versions of Excel.

A Brief History of File Formats

Excel has traditionally stored workbooks in Binary Interchange File Format (BIFF). The BIFF specification has changed occasionally over time. In 1993, when Excel expanded to 16,384 rows, Microsoft began using the BIFF5 format. In 1993, most companies did not have corporate local area networks (LANs); a file format conversion therefore usually affected just one person on one computer. If you had upgraded from Excel 4 to Excel 5, as long as you had a way to convert your Excel 4 files to Excel’s new BIFF5 format, everything was fine.

In 1997, Microsoft introduced a major file change, BIFF8. This version of BIFF allowed 65,536 rows.
The rise of the Internet and email meant that far more people were now sharing files. Excel 97 offered a way to save files in the old format in case you needed to share files with a person using legacy versions of Excel.

All BIFF versions are proprietary formats. Figure 7.4 shows a simple Excel 2003 spreadsheet and the corresponding BIFF, as viewed in Notepad. You would certainly never be able to open a Notepad window and begin typing a new spreadsheet. Similarly, it would be very difficult for other applications to extract data from the BIFF format.

Note: Although you will generally be saving files in one of the .xlsb, .xlsx, and .xlsm formats, there are other new file formats. The .xlam format is used by developers to distribute add-ins. The .xltx format is a template format.

Figure 7.4 BIFF files are difficult for other applications to read.

In Excel 2000, Microsoft flirted with a new HTML file format. By default, files were stored as XLS files in BIFF8 format. However, you could save a file as an HTML file and later open that HTML file in Excel 2000. With some limitations, most contents of the file and formatting could be successfully round-tripped from Excel to HTML and back to Excel. This produced an interesting new paradigm: any program that could read or write text files could now extract data from the Excel HTML file or produce a file in this format.

Using HTML made sense in 1998–2000. The rise of the Internet made HTML a very popular format. However, although HTML is a great language for the display of information, it is not necessarily a smart language.

In 1998, the World Wide Web Consortium published the first 1.0 specification for a new language called Extensible Markup Language (XML), which presents data that any platform or application can read. Like HTML, XML is a simple text file that can be read or created with Notepad.
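To make the point about any application being able to read XML concrete, here is a minimal Python sketch that uses only the standard library. The XML below is made up for illustration and is not the schema Excel uses; the element names sheet, row, region, and revenue and the helper read_rows are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML for illustration only; this is NOT Excel's actual schema.
SAMPLE = """<sheet name="Sales">
  <row><region>East</region><revenue>1200</revenue></row>
  <row><region>West</region><revenue>950</revenue></row>
</sheet>"""


def read_rows(xml_text):
    """Return a list of (region, revenue) tuples from the sample XML."""
    root = ET.fromstring(xml_text)
    return [
        (row.findtext("region"), int(row.findtext("revenue")))
        for row in root.iter("row")
    ]


rows = read_rows(SAMPLE)
print(rows)                                 # [('East', 1200), ('West', 950)]
print(sum(revenue for _, revenue in rows))  # 2150
```

Because the tags describe the data they contain, a dozen lines in a general-purpose language are enough to pull the values out, which is not something a binary format such as BIFF allows.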
Excel 2002 offered a way to export data in XML. Excel 2003 continued to use BIFF8 as the standard file format, but you could choose to save a workbook in XML format. When you later opened the XML file in Excel, all the formulas and formatting would be successfully round-tripped. XML in Excel 2003 did not support VBA or charts.

There are a number of advantages to XML. Because an XML file is a simple text file, any program can easily read data from it. This file format is also less prone to corruption than BIFF. If you randomly wipe out several bytes of a BIFF file, it is likely that the file will be corrupt and no longer open in Excel. If you truncate or corrupt several bytes of an XML file, the rest of the data is still readable in Excel.

Excel 2010 offers three official file formats (BIFF12, XLSX, and XLSM), which are described in the following sections. In addition, Excel offers support for BIFF8 and even BIFF5, in case you have files floating around from Excel 95.

Using the New Binary File Format: BIFF12

With Excel 2007’s increase in rows and columns, BIFF8 would no longer work. Excel 2010 can save files in a new binary file format known as BIFF12. Files stored in BIFF12 have an .xlsb file extension. The Save As dialog box calls this type of file Excel Binary Workbook. For the first time, the binary workbook is not the default method for saving in Excel.

BIFF12 suffers from the same problems as all previous BIFF versions: it is difficult for other applications to read from or write to BIFF formats, and if parts of the BIFF12 file become corrupted or truncated, Excel has a difficult time successfully loading the file.

Using the New XML File Formats: XLSX and XLSM

XML in Excel 2003 was almost an ideal solution: files could be round-tripped from Excel to XML and back to Excel, provided that the files did not include VBA macros, charts, or other embedded images. Excel now offers complete support for every feature in the new XML file formats.
Workbooks can contain charts, tables, WordArt, SmartArt, shapes, and images. For security purposes, Excel supports XML file formats that are macro free and file formats that are macro enabled. These are the two XML file formats that Excel 2010 supports:

• XLSX: Files stored with the .xlsx extension are the default file type in Excel 2010. This XML file format does not allow macros.
• XLSM: Files stored with the .xlsm extension are XML files that allow for the inclusion of VBA macros.

Note: If you are extremely concerned with performance issues, you might want to use BIFF12 because a large BIFF12 file loads more quickly and saves more quickly than the new XML formats. However, if you are concerned with file size issues, the new XML file formats will win.

The new XLSX and XLSM file formats are actually zip files, which makes it easy to look inside the file formats. Figure 7.5 shows a worksheet that has a number of elements. It has a pivot table in row 21, WordArt, SmartArt, sparklines, slicers, and a chart.

Figure 7.5 This workbook is saved in XLSM, one of the new XML formats.

To look inside any Excel 2010 document, follow these steps:

1. In Windows Explorer, right-click the document name, which is in the format filename.xlsm, and select Rename.
2. Change the file extension to .zip. Windows warns you that if you change a file extension, the file may become unusable.
3. Click Yes to confirm the change.
4. Open the zip file with WinZip or any other zip utility.

As shown in Figure 7.6, inside the zip file, you can see several XML components. The embedded image is included in the zip file. All the settings and styles, drawings, and data are stored as separate XML files within the zip file. Unzipped, these components would take up 115KB.
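The renaming steps above can also be approximated in code. The following Python sketch is not from the book; it assumes Python's standard zipfile module, and the helper name list_workbook_parts is made up for illustration. It lists each component inside a workbook package along with its unzipped and zipped sizes. Because zipfile recognizes a zip archive by its contents rather than by its extension, this sketch does not need the rename to .zip.

```python
import sys
import zipfile


def list_workbook_parts(workbook):
    """Return (part name, unzipped bytes, zipped bytes) for each part in the package."""
    # 'workbook' may be a path such as filename.xlsm or an open binary file object.
    with zipfile.ZipFile(workbook) as package:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in package.infolist()
        ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python list_parts.py filename.xlsm
    parts = list_workbook_parts(sys.argv[1])
    for name, raw, packed in parts:
        print(f"{name}: {raw} bytes unzipped, {packed} bytes zipped")
    print("total unzipped:", sum(raw for _, raw, _ in parts))
    print("total zipped:", sum(packed for _, _, packed in parts))
```

Comparing the two totals gives the same kind of unzipped-versus-zipped comparison that the text describes.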
Because they are zipped files, the data is stored in 40KB.

Version Compatibility

With not just one new file format but three file formats in Excel 2010, you will face a number of problems as you try to share files with people who use legacy versions of Excel.

If you are using Excel 2010 and want to open a file created in Excel 5 through Excel 2003, your copy of Excel will gladly open the file, but the file will be in a special compatibility mode. In this mode, you cannot use more than 65,536 rows, and you cannot use more than 256 columns. When you attempt to save the file, the Compatibility Checker will tell you what functionality will be lost.

If you start with a new spreadsheet in Excel 2010 and want to share the file with someone using Excel 5, 95, 97, or 2000, you have to use the Compatibility Checker and save the file in a legacy version of Excel.

If you start with a new spreadsheet in Excel 2010 and want to share the file with someone using Excel 2002 or Excel 2003, you can encourage that person to download the Compatibility Pack for the Microsoft Office System. This converter allows Excel 2002 and Excel 2003 to open files stored in the new XLSB, XLSM, and XLSX formats. However, before you send the file, you should run it through the Compatibility Checker. To do so, follow these steps:

1. With the file open, go to the File menu. Excel opens in Backstage view.
2. Along the left navigation, select Info.
3. In the middle of the center pane, open the Check for Issues drop-down.
4. Select Check Compatibility. Excel displays the Compatibility Checker, as shown in Figure 7.7. The new Select Versions to Show button allows you to filter the list to problems that would affect Excel 2007 or problems that would affect only Excel 2002–2003.
Figure 7.6 The components of the workbook are stored as XML files, zipped, and then renamed with an .xlsx or .xlsm extension.

Figure 7.7 The Compatibility Checker reports on any compatibility issues with legacy versions of Excel.

If you know this is a workbook that will always be opened by someone using Excel 2002–2007, you can select the Check Compatibility When Saving This Workbook check box to make sure you have not introduced any incompatibilities.

You will see two types of problems reported in the Compatibility Checker:

• Minor loss of fidelity: These issues involve formatting. If you used a color beyond the palette of 56 colors, it will be reported as a minor loss of fidelity. These types of issues are not considered serious. The person using Excel 2002, Excel 2003, or Excel 2007 will be allowed to open the file, edit the file, and save the file.
• Significant loss of functionality: This means that you have used a feature in Excel 2010 that will no longer function in legacy versions of Excel. When you have these types of issues, the file will open in legacy versions of Excel, but it will be forced into a read-only state.

When you have an issue reported as a significant loss of functionality, you can click the Find hyperlink. Excel closes the Compatibility Checker and takes you to the area of the worksheet that contains the problem. If you use this method, you will constantly have to keep returning to the Compatibility Checker to test other problems.

If many problems exist, click the Copy to New Sheet button. Excel inserts a new Compatibility Report worksheet in the workbook.
Each issue is listed, along with a count of the number of issues and hyperlinks to help you find the problem (see Figure 7.8).

Figure 7.8 If you have many significant issues, it will be easier to copy the contents of the Compatibility Checker to a new worksheet.

Opening Excel 2010 Files in Excel 2002 or 2003

If you attempt to open an Excel 2010 file in Excel 2002 or Excel 2003, you see a message that the file was created in a newer version of Excel. Excel allows you to download a converter so that you can open the file. The download process happens as part of the file open process. It is quick. You have to do it only once per computer. Most people will not even remember downloading the utility.

After you install the Compatibility Pack, you can directly open XLSM, XLSX, and XLSB files in Excel 2003 or Excel 2002. You can also save your files back into the new format.

Note: Keep in mind that you need to get the person to pay attention to the question that pops up when they first open your file and choose to download the converter. With the malware situation today, many people will automatically choose to download nothing. For this reason, a little hand-holding over the phone as the person opens your file for the first time might be appropriate.

Minor Loss of Fidelity

Suppose that you have a simple data worksheet in Excel 2010. You might have used a few custom colors or perhaps cell styles. When you open that file in Excel 2003, you will receive a note that the custom color was converted to the closest color in the standard Excel 2003 palette of 56 colors. Other than the formatting change, you can edit and then save the file back to an Excel 2010 format.

Significant Loss of Functionality

The Excel 2010 workbook shown previously in Figure 7.5 contains several elements that would report as a significant loss of functionality in Excel 2003.
When you attempt to open this workbook in Excel 2003, Excel warns you that there is significant loss of functionality. Because of the loss of functionality, Excel forces the document to be opened as read-only. You have to use File, Save As to save the file with a new name. This prevents the original document from losing the incompatible features.

Excel 2003 does its best to deal with the incompatible features. Figure 7.9 shows the file opened in Excel 2003. The WordArt changes to boring words. The SmartArt keeps the same basic look and feel, although it is now a static shape. The slicer and the icon sets are lost. Any cells using new functions such as SUMIFS calculate with a #NAME? error. Any cells beyond row 65536 are lost.

Figure 7.9 Excel does its best to render certain features in Excel 2003.

Creating Excel 2010 File Formats in Excel 2003

After you install the Office Compatibility Pack on a machine running Excel 2002 or Excel 2003, you can save files from Excel 2003 or Excel 2002 in one of the new Excel 2007–2010 file formats. The advantage is that when the file is later opened on a machine running Excel 2010, the file will not be forced into compatibility mode.

After you have the Compatibility Pack installed, follow these steps to save files in an Excel 2007–2010 format:

1. Open or create a workbook in Excel 2002 or Excel 2003.
2. Select File, Save As. The Save As dialog box appears.
3. Open the Save as Type drop-down and scroll to the bottom of the list. Choose one of the Excel “12” file types.
4. Type a filename.
5. Click Save. Excel runs a converter to convert the workbook to the chosen Excel 2010 format.

Opening Excel 2010 Files in Excel 2007

Although both Excel 2007 and Excel 2010 support the same file formats, some new Excel 2010 features will not work in Excel 2007. Figure 7.10 shows the same workbook opened in Excel 2007.
There is no warning that you have lost any functionality, and the workbook is not forced into compatibility mode. However, the slicers are missing. The icon set happened to be using one of the new Excel 2010 icon styles, so the icons are missing. If you save that file in Excel 2007 and open it in Excel 2010, the icon set and the slicer are lost. You will have to refresh the pivot table to calculate the pivot table without the slicer.

Caution: Although Excel 2007 and 2010 seem fairly similar, Microsoft seems to have a freer attitude about the incompatibilities between Excel 2007 and Excel 2010. Because the files are not displaying warnings when opened in Excel 2007, there will probably be more serious problems with people unknowingly losing features when opening their Excel 2010 files on their home PC with Excel 2007.

Figure 7.10 You are not forced into read-only mode, but some features are lost in Excel 2007.

8 Understanding Formulas

Excel’s forte is performing calculations. When you use Excel, you typically use a combination of cells with numbers and cells with formulas. After you design a spreadsheet to calculate something, you can change the numbers used in the assumption cells and then watch Excel instantly calculate new results.

Getting the Most from This Chapter

Do not skip this entire chapter; one particular trick in this chapter can save you daily frustration.

I regularly entertain accountants and auditors with my Power Excel program. Although this program is a fun, laughter-filled tour through the inside tricks of Microsoft Excel, people are learning new things along the way. I call them “gasp moments.”

Imagine this setting: I am in front of 200 managerial accountants who have Excel open for 40 hours each week. You can generally figure that these folks are super-efficient with Excel. For any trick that I show, it might be already in the arsenal of half to three-quarters of the room. A lot of the people nod their heads, while others look surprised.
Invariably, though, a few tricks get a gasp from perhaps 90 percent of the people, who did not know the trick and realize just how powerful it is. I thrive on the gasp moments.

Most people reading this book believe they know Excel formulas. To a certain extent, this chapter is a primer for the person who is new to Excel. However, even the most astute person using Excel should check out these sections of the chapter:

• Everyone should read the “Double-Click the Fill Handle to Copy a Formula” section. Somehow, most people have learned to drag the fill handle to copy a formula. This leads to horrible frustration on long data sets, as you go flying past the end of the data. This simple but powerful trick is the one that universally amazes attendees of my seminar.
• Honestly answer this question: Do you really understand the difference between cell H1 and cell $H$1? If you think the latter has anything to do with currency, you need to review the “Overriding Relative Behavior: Absolute Cell References” section thoroughly. This isn’t a trick, but one of the fundamental building blocks of Excel worksheets. Roughly 5 percent of the people in a Power Excel seminar do not understand this concept, and about 30 percent of the people in a community computer club presentation do not understand it. If you don’t know when and why to use the dollar signs, you are in good company with 20 million other people using Excel. It is worth taking time to learn this essential technique.
• Finally, a person’s age can be predicted by how that person enters formulas. There are three ways, and I believe my preferred way is the best. I probably will not convince you to change, but when you understand my way, you can enter formulas far faster than with the other two ways. To get a good understanding of the alternatives, read the “Three Methods of Entering Formulas” section later in this chapter.
Introduction to Formulas

When Microsoft overhauled Excel for the 2007 version, a number of formula limits were dramatically increased. For example, the number of characters in a formula increased from 1,024 to 8,192. The number of levels of nesting for IF functions increased from 7 to 64. Thanks to those improvements, you can calculate almost anything with a formula in Excel.

This chapter and Chapter 9, “Controlling Formulas,” deal with formula basics. Chapter 10, “Understanding Functions,” through Chapter 15, “Using Trig, Matrix, and Engineering Functions,” introduce adding functions to your formulas. Chapter 16, “Connecting Worksheets, Workbooks, and External Data,” introduces formulas that calculate data found on other worksheets or in other workbooks. Chapter 17, “Using Super Formulas in Excel,” provides interesting examples such as 3D formulas and the all-powerful array formulas.

Because of the record-oriented nature of spreadsheets, you can generally build a formula once and then copy that formula to hundreds or thousands of cells without changing anything in the formula.

Tip: Designing a formula that can be written once and then copied to a rectangular range of data is a fantastic way to use Excel more efficiently.

Formulas Versus Values

When looking at an Excel grid, you cannot tell the difference between a cell with a formula and one that contains numbers. To see if a cell contains a number or a formula, select the cell. Look in the formula bar. If the formula bar contains a number, as shown in Figure 8.1, you know that it is a static value. If the formula bar contains a formula, you know that the number shown in the grid is the result of a formula calculation (see Figure 8.2). Keep in mind that most formulas start with an equal sign.

Entering Your First Formula

Your first formula was probably a SUM function, entered with the AutoSum button.
However, this discussion is about a pure mathematical formula that uses a value in a cell, added, subtracted, divided, or multiplied by a number or another cell. Billions of variations of formulas can be used. Everyday life throws situations at you that can be solved with this type of formula. Keep these important points in mind as you start tinkering with your own formulas:

• Every formula starts with an equal sign.
• Entering formulas is just like typing an equation in a calculator with one exception (see the next point).
• If one of the terms in your formula is already stored in a cell in Excel, you can point to that cell’s address instead of typing the number into the formula. Using this method allows you to change the value in one cell and then watch all of the formulas recalculate.

To illustrate these points, see the steps to building a basic formula in the following example.

Figure 8.1 The formula bar reveals whether a value is a static number or a calculation. In this case, cell B2 contains a static number.

Figure 8.2 In this case, cell B2 contains the result of a formula calculation. A formula usually starts with an equal sign.

Building a Formula

You want to enter a formula to calculate a target sales price, as shown in Figure 8.3. Cell B2 shows the product cost. In column C, you want to calculate the list price as two times the cost plus 3.

To enter a formula, follow these steps:

1. Put the cell pointer in cell C2.
2. Type an equal sign. The equal sign tells Excel that you are starting a formula.
3. Type 2*B2 to indicate that you want to multiply two times the value in cell B2.
4. Type +3 to add three to the result. There should be no spaces in the formula. If your formula reads =2*B2+3, proceed to step 5. Otherwise, use the Backspace key to correct the formula.
5. Press Enter. Excel calculates the formula in cell C2.
By default, Excel usually moves the cell pointer down or to the right after you finish entering a formula. You should move the cell pointer back to cell C2 to inspect the formula, as shown in Figure 8.3. Note that Excel shows a number in the grid, but the formula bar reveals the formula behind the number.

Figure 8.3 The formula in cell C2 recalculates if the value in cell B2 changes.

The Relative Nature of Formulas

The formula =2*B2+3 really says, “Multiply two by the cell immediately to the left of me and then add three.” If you need to put this formula in cells C3 to C999, you do not need to reenter the formula 997 times. Instead, copy the formula and paste it to all the cells. As you copy, Excel copies the essence of the formula: “Multiply two by the cell to the left of me and add three.” As you copy the formula to cell C3, the formula becomes =2*B3+3. Excel handles all this automatically. Figure 8.4 shows the formula after it is copied.

Figure 8.4 After you paste the formula, Excel automatically updates the cell reference to point to the current row.

Excel’s ability to change B2 to B3 in the formula is called relative referencing. This is the default behavior of a reference. Sometimes, you do not want Excel to change a reference as the formula is copied, as explained in the next section.

Overriding Relative Behavior: Absolute Cell References

Relative referencing, which is Excel’s ability to change a formula as it is copied, is what makes spreadsheets so useful. At times, however, you need part of a formula to always point at one particular cell. This happens a lot when you have a setting at the top of the worksheet such as a growth rate or a tax rate. It would be nice to change this cell once and have all the formulas use the new rate.

The following example sets up a sample worksheet that exhibits this problem and shows how to use an arcane notation style to solve the problem.
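The relative behavior described above can be imitated in a few lines of code. The following Python sketch is only an illustration, not how Excel is implemented. The function name copy_relative and the regular expression are assumptions made for this example, and it assumes the formula contains only plain relative references in uppercase (no dollar signs, and no function names that end in digits).

```python
import re

# One plain A1-style reference: column letters followed by a row number.
REFERENCE = re.compile(r"([A-Z]+)([0-9]+)")


def column_to_number(letters):
    """Convert column letters to a 1-based number (A -> 1, Z -> 26, AA -> 27)."""
    number = 0
    for ch in letters:
        number = number * 26 + ord(ch) - ord("A") + 1
    return number


def number_to_column(number):
    """Convert a 1-based column number back to letters."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def copy_relative(formula, rows_down=0, cols_right=0):
    """Rewrite each reference as if the formula were pasted that many cells away."""
    def shift(match):
        column, row = match.group(1), int(match.group(2))
        return number_to_column(column_to_number(column) + cols_right) + str(row + rows_down)
    return REFERENCE.sub(shift, formula)


if __name__ == "__main__":
    # Copy the formula in C2 down to C3, C4, and C5.
    for offset in range(1, 4):
        print(f"C{2 + offset}: {copy_relative('=2*B2+3', rows_down=offset)}")
```

Pasting one row down turns =2*B2+3 into =2*B3+3, which matches the copied formula in the example: the reference moves by the same distance as the formula itself.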
When you see a reference with two dollar signs, such as $G$1, this indicates an absolute reference to G1. An absolute reference is a cell or range address where the row numbers and the column letters are locked and will not change during copying. Absolute references have a dollar sign before each column letter and each row number. Examples include $G$1 and $T$2:$W$99.

Suppose that you have a sales tax factor in a single cell at the top of a worksheet. After you enter the formula =C2*G1, it accurately calculates the tax in cell D2, as shown in Figure 8.5. However, when you copy the same formula to cell D3, you get a zero as the result. As you can see in Figure 8.6, Excel correctly changed cell C2 to C3 in the copied formula. However, Excel also changed G1 to G2. Because there is nothing in G2, the calculation produces a zero.

Figure 8.5 This formula works fine in row 2.

Figure 8.6 This formula fails in row 3. The formula now points to empty cell G2.

Because the sales tax factor is only in G1, you want Excel to always point to G1. To make this happen, you need to build the original formula as =C2*$G$1. The two dollar signs tell Excel that you do not want the reference to change as the formula is copied. The $ before the G freezes the reference to always point to column G. The $ before the 1 freezes the reference to always point to row 1. Now, when you copy this formula from cell D2 to other cells in column D, Excel changes the formula to =C3*$G$1, as shown in Figure 8.7. To recap, a reference with two dollar signs is called an absolute reference.

Figure 8.7 The dollar signs in the formula make sure that the copied formula always points to cell G1.

Using Mixed References to Combine Features of Relative and Absolute References

In a number of situations, you might want to build a reference that has only one dollar sign. For example, in Figure 8.8, you want to use the monthly bonus rate in row 3, but you want to allow the column to change.
In this case, the formula for cell B19 would be =B6*B$3. When you copy this formula, it always points to the bonus amount in row 3, but the remaining elements of the formula are relative. For example, the formula in E21 is =E8*E$3, which multiplies Maia’s April sales by the April bonus rate.

Figure 8.8 By having the dollar sign before the 3 in C$3, you lock the reference to row 3 but allow the formula to point to columns D, E, and so on as you copy the formula.

There are two kinds of mixed references. One mixed reference freezes the row number and allows the column letter to change. The other mixed reference freezes the column letter but allows the row number to change. No one has thought up clever names to distinguish between these references, so they are simply called mixed references.

To illustrate the other kind of mixed reference, as shown in Figure 8.9, say you want a single formula to multiply the daily rate from column A by the number of days in row 5. This formula requires both kinds of mixed references.

Figure 8.9 You can create a formula by using a combination of dollar signs to allow cell C6 to be copied to all cells in the table.

In this case, you want the cell A6 reference to always point to column A, even when the formula is copied to the right. Therefore, the A6 portion of the formula should be entered as $A6. You also want the C5 portion of the formula to always point to row 5, even when the formula is copied down the rows. Therefore, the C5 portion of the formula should be entered as C$5.

Using the F4 Key to Simplify Dollar Sign Entry

In the preceding section, you entered quite a few dollar signs in formulas. The good news is that you do not have to type the dollar signs! Instead, immediately after entering a reference, press the F4 key to toggle the reference from a relative reference to an absolute reference, which automatically has the dollar signs before the row and column.
If you press F4 again, the reference toggles to a mixed reference with a dollar sign before the row number. When you press F4 once again, the reference toggles to a mixed reference with a dollar sign before the column letter. Pressing F4 one more time returns the reference to a relative reference. You might find it easier to choose the right reference by looking at the various reference options offered by the F4 key.

The following sequence shows how the F4 key works while you are entering a formula. This particular example was included because it requires two different types of mixed references. The important concept is that you start pressing F4 after typing a cell reference but before you type a mathematical operator.

1. Type =A6 (see Figure 8.10).

Figure 8.10 This is a relative reference: A6 without any dollar signs.

2. Before typing the asterisk to indicate multiplication, press the F4 key. On the first press of F4, the reference changes to =$A$6, as shown in Figure 8.11.

Figure 8.11 After you press F4 once, the reference changes to an absolute reference, with two dollar signs.

3. Press the F4 key again. The reference changes to =A$6 to freeze the reference to row 6, as shown in Figure 8.12.

Figure 8.12 Press F4 again to switch to a mixed reference with the row number locked.

4. Press F4 one more time. Excel locks just the column, changing the reference to =$A6, as shown in Figure 8.13. This is the version of the reference that you want.

Figure 8.13 Press F4 again to switch to a mixed reference with the column letter locked.

5. To continue the formula, type an asterisk to indicate multiplication and then click cell C5 with the mouse. At the point shown in Figure 8.14, you would press F4 twice to change C5 to a reference that locks only the row (that is, C$5).
6. Press Enter to accept the formula.
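The F4 cycle in the steps above, and the effect that each dollar sign has when a formula is copied, can be sketched in code. This Python example is an illustration under stated assumptions and not Excel's own logic. The names press_f4 and copy_formula are hypothetical, references are assumed to be uppercase A1-style, and the cycle order is taken from the sequence described in the text.

```python
import re

# One A1-style reference with optional dollar signs: A6, $A$6, A$6, or $A6.
REFERENCE = re.compile(r"(\$?)([A-Z]+)(\$?)([0-9]+)")

# (column locked, row locked) in the order F4 steps through them.
F4_CYCLE = [(False, False), (True, True), (False, True), (True, False)]


def press_f4(reference):
    """Return the reference as it would read after one more press of F4."""
    col_lock, column, row_lock, row = REFERENCE.fullmatch(reference).groups()
    state = (bool(col_lock), bool(row_lock))
    lock_col, lock_row = F4_CYCLE[(F4_CYCLE.index(state) + 1) % len(F4_CYCLE)]
    return f"{'$' if lock_col else ''}{column}{'$' if lock_row else ''}{row}"


def column_to_number(letters):
    number = 0
    for ch in letters:
        number = number * 26 + ord(ch) - ord("A") + 1
    return number


def number_to_column(number):
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def copy_formula(formula, rows_down=0, cols_right=0):
    """Shift every unlocked column letter and row number; leave locked parts alone."""
    def shift(match):
        col_lock, column, row_lock, row = match.groups()
        if not col_lock:
            column = number_to_column(column_to_number(column) + cols_right)
        if not row_lock:
            row = str(int(row) + rows_down)
        return f"{col_lock}{column}{row_lock}{row}"
    return REFERENCE.sub(shift, formula)


if __name__ == "__main__":
    reference = "A6"
    for _ in range(4):
        reference = press_f4(reference)
        print(reference)  # $A$6, then A$6, then $A6, then back to A6
    # Copy the finished formula from C6 to E9: three rows down, two columns right.
    print(copy_formula("=$A6*C$5", rows_down=3, cols_right=2))  # =$A9*E$5
```

The part of the address immediately after each dollar sign is the part that copy_formula leaves untouched, which is why =$A6*C$5 keeps column A and row 5 fixed wherever the formula is pasted.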
7. When you copy the formula from cell C6 to the range C6:H36, the formula automatically multiplies the rate in column A by the number of days in row 5. Figure 8.15 shows the copied formula in cell E9. The formula correctly multiplies the 55-dollar rate in cell A9 by the three days figure in cell E5.

Tip: If you are having trouble remembering whether you want the dollar signs before the row, the column, or both, press F4 and then look at the formula to figure out if you are freezing the correct element, either the row or the column. The item immediately after the dollar sign is the part of the address being frozen.

Note: After you press F4 again, Excel returns the reference to the relative state =A6. As you continue to press F4, Excel toggles between the four modes. It is fine to toggle between them all and then choose the correct one. If you accidentally toggle past the =$A6 version, just keep pressing F4 until the correct mode comes up again.

Figure 8.14 After clicking C5 with the mouse, press F4 twice to change this reference.

Figure 8.15 By using the correct combination of row and column mixed references, you can enter this formula once and successfully copy it to the entire rectangular range.

Using F4 After a Formula Is Entered

The F4 trick described in the preceding section works immediately after you enter a reference. If you try to change cell A6 after you type the asterisk, pressing the F4 key has no effect. However, you can still use F4 by clicking somewhere in the formula bar adjacent to the characters A6. Pressing F4 now adds dollar signs to that reference.

Using F4 on a Rectangular Range

Some functions allow you to specify a rectangular range. For example, in Figure 8.16, you would like to enter a formula to calculate month-to-date sales. One formula in cell C29 is =SUM(B2:B29). To copy this formula, you need to change the formula to =SUM(B$2:B29). At this point in the figure, you might be tempted to press the F4 key. However, pressing F4 now would convert the reference to the fully absolute range $B$2:$B$29. Continuing to press F4 would toggle to B$2:B$29, then $B2:$B29, and then B2:B29. Excel does not even attempt to go through the other 12 possible combinations of dollar signs to offer B$2:B29 eventually.

Figure 8.16 Using F4 at this point will never produce the desired result of B$2:B29.

In this case, you need to click the insertion point just before, just after, or in the middle of the characters B2 in the formula bar. If you then press F4, you toggle through the various dollar sign combinations on the B2 reference. Pressing F4 twice results in the proper combination, as shown in Figure 8.17.

Figure 8.17 Using F4 is tricky when your reference is a rectangular range; you must click into the formula.

Three Methods of Entering Formulas

In the examples in the previous sections, you entered a formula by typing it. Although you generally need to start a formula by typing the equal sign (or the plus sign), after that point, you have three options:

• Type the complete formula as described in the previous sections.
• Type operator keys, but use the mouse to touch cell references. In this book, this is referred to as the mouse method.
• Type the operator keys, and then use the arrow keys to specify the cell references by navigating to the cells. In this book, this method is referred to as the arrow-key method.

Assume you would like to multiply the merchandise total in cell B2 by the sales tax rate in cell F1, as shown in Figure 8.18.

Figure 8.18 You can use three methods to enter the formula =B2*$F$1.

Enter Formulas Using the Mouse Method

If you started using computers after the advent of Microsoft Windows 3.1, it is likely that you use the mouse method for entering formulas. This method is intuitive, but it requires you to move your hand between the keyboard and the mouse several times, as in this example:

1. Type =.
2. Click in cell B2.
3. Type *.
4. Click in cell F1.
5. Press F4 to add the dollar signs.
6. Press Enter. This usually moves the cell pointer to cell C3.

This method requires only four keystrokes, but it requires you to move to the mouse twice. Moving to the mouse is the slowest part of entering formulas, but this method is easier than typing the entire formula if you are not a touch typist.

Entering Formulas Using the Arrow-Key Method

The arrow-key method is popular with people who started using spreadsheets in the days of Lotus 1-2-3 Release 2.2. It is worthwhile to learn this method because it is incredibly fast. Almost all formula entry can be accomplished using keys on the right side of the keyboard. Here’s how it works:

1. In cell C2, type =.
2. Press the left-arrow key to move the flashing cell border to cell B2. Note that the active cell, which is the one with a solid border, is still cell C2. The flashing border is like a second cell pointer that you can use to point to the correct cell for the formula. As shown in Figure 8.19, the temporary formula in the formula bar reads =B2.

Figure 8.19 By using the arrow keys during formula entry, you create a flashing border that can be used to navigate to a cell reference.

Tip: If you have a desktop keyboard, you can use the asterisk key on the numeric keypad to avoid pressing the Shift key.

3. To accept cell B2 as the correct reference in the formula, press either an operator key (for example, * or +), a parenthesis, or the Enter key. In this case, type *.
4. Note that the dashed cell pointer disappears, and the focus is now back to the original cell, C2.
5. Press the right-arrow key three times. The flashing cell border moves to D2, E2, and then F2. With each keypress, the temporary formula in the formula bar shows an incorrect formula (=B2*D2, =B2*E2, and =B2*F2).
Figure 8.20 shows what the screen looks like after you press the right-arrow key three times.
6. Press the up-arrow key to move the flashing cell border to the correct location, cell F1. The temporary formula in the formula bar now shows =B2*F1.
7. Press the F4 key to add dollar signs to the F1 reference.
8. Press Ctrl+Enter to accept the formula and keep the cell pointer in cell C2.

Note: As you are moving the flashing cell border, ignore the formula bar and watch just the flashing cell border.

Tip: Even if you are mouse-centric, you should try this method for half a day. When you get the feel for navigating by using the arrow keys, you can enter formulas much faster by using this method.

Figure 8.20 After step 4, the focus moves to the original cell. Thus, you only have to press the right-arrow key three times instead of four times to arrive at cell F2.

Note: Officially, every formula must start with an equal sign. However, to make former Lotus 1-2-3 users comfortable, Excel allows you to start a formula with a plus sign. Power Excel users have discovered that using a plus sign allows you to start a formula by typing on the numeric keypad. Because I routinely start formulas with the plus sign, I am often asked why I start with =+ instead of just =. Even though the formulas appear that way onscreen, I don’t actually enter the =. When a formula starts with a plus sign, Excel adds an equal sign and does not remove the plus sign, so you end up with a formula that looks like =+B2*$F$1.

Using this method requires 10 keystrokes, with no trips to the mouse. You can enter formulas that have no absolute references, mixed references, parentheses, or exponents by using just the arrow keys and the keys on the numeric keypad.
to see a video of entering formulas using all three methods, search for excel in depth 8 at youtube.

entering the same formula in many cells

so far in this chapter, you have entered a formula in one cell and then copied and pasted to get the formula in many cells. to enter the same formula in many cells, you can use three alternatives:
• preselect the entire range where the formulas need to go. enter the formula for the first cell and press ctrl+enter to enter the formula in the entire selection simultaneously.
• enter the formula in the first cell and then use the fill handle to copy the formula.
• beginning with excel 2007, another method is to define the range as a table. when this method is used, the new formulas are copied automatically.

copying a formula by using ctrl+enter

this strategy works when you are entering formulas for one or more screens that are full of data:
1. if you have just a few cells, select them before entering the formula.
2. click in the first cell and drag down to the last cell, as shown in figure 8.21. notice from the name box that the active cell is the first cell.

figure 8.21 click in the first cell and drag down to the last cell to select a range with the first cell as the active cell.

3. enter the formula by using any of the three methods described earlier in this chapter. even if you use the arrow-key method, excel keeps the entire range selected. figure 8.22 shows the formula after you press f4 to convert the f1 reference to $f$1.
4. at this point, you would normally press enter to complete the formula. instead, press ctrl+enter to enter this formula in the entire selected range. note that excel does not enter =b2*$f$1 in each cell. instead, it converts the formula as if it were copied to each cell.
figure 8.23 shows the formula in cell c20.

copying a formula by dragging the fill handle

if you want to enter a formula in one cell and then copy it to the other cells in a range, you can use the fill handle, which is the square dot in the lower-right corner of the cell pointer. there are two ways to use the fill handle:
• drag the fill handle.
• double-click the fill handle.

figure 8.22 even with a large range selected, the formula is built only in the active cell.

figure 8.23 pressing ctrl+enter tells excel to enter the formula in the active cell and to copy it to the rest of the selection.

the dragging method works fine when you have less than one screen full of data:
1. enter the formula in cell c2.
2. press ctrl+enter to accept the formula and keep the cell pointer in cell c2.
3. click the fill handle. you know that you are above the fill handle when the mouse pointer changes to a thick plus sign, as shown in figure 8.24. drag the mouse down to the last row of data, which in this case is cell c14.

figure 8.24 you can copy a formula by clicking and dragging the fill handle.

4. when you release the mouse button, the original cell is copied to all the cells in the selected range.

this method is fine for copying a formula to a few cells. however, if you have thousands or hundreds of thousands of cells, it is annoying to drag to the last row. in excel 2003 and earlier, you would invariably end up flying past the last row. note that excel 2010 will automatically slow down and briefly pause at the last row. however, it is far easier to copy a formula by double-clicking the fill handle.

double-click the fill handle to copy a formula

in most data sets, double-clicking the fill handle is the fastest way to copy the formula.
although you will love this method, you need to understand a few shortcomings that can hamper the method when an adjacent column has blank cells among the data. figure 8.25 shows a table that has hundreds of rows of data. suppose you want to copy the formula from cell c2 down alongside all the rows of data in column b. in this particular case, cell b2 is nonblank, and column b contains a value in every row down to the end of the data. this is the perfect condition for using the technique of double-clicking the fill handle. follow these steps:
1. enter the formula in cell c2.
2. press ctrl+enter to accept the formula and keep the cell pointer in cell c2.
3. double-click the fill handle.

the active cell is copied down to the last row of your data, as shown in figure 8.26.

figure 8.25 dragging the fill handle is frustrating in a table that has hundreds of rows. the double-click method will end that frustration.

figure 8.26 double-click the fill handle to copy to the last row of your data.

the fill handle double-click method is fast. however, several arcane rules can trip you up, particularly if the column to the left contains a blank cell before the end of the data. excel will be tricked into stopping the copy early because of this blank cell. in addition, a couple of other rules exist:
• the fill handle can copy based on data in the adjacent column to the right, or even data in the current column.
• you should always press end and then press the down arrow to make sure that the formula was copied far enough.

for a complete discussion of the arcane rules, see the “troubleshooting tip” section at the end of this chapter.

use the table tool to copy a formula

the table tool was improved dramatically in excel 2007.
when using this tool, if you tell excel that your current data set is a table, excel automatically copies new formulas down to the rest of the cells in the table.

figure 8.27 shows an excel worksheet that has headings at the top and many rows of data below the headings.

figure 8.27 this is a typical worksheet in excel.

to define a range as a table, select a cell within the data set and press ctrl+t. excel uses its intellisense to guess the edges of the table. if its guess is correct, click ok in the create table dialog, as shown in figure 8.28.

figure 8.28 the create table dialog.

ctrl+t is one of four entry points for creating a table. you can still use the excel 2003 shortcut of ctrl+l. you can choose format as table on the home tab. you can choose the table icon from the insert tab.

as shown in figure 8.29, after excel recognizes the range as a table, several changes occur:
• the table is formatted with the default formatting. depending on your preferences, this might include banded rows or columns.
• autofilter drop-downs are added to the headings.
• any formulas that you enter use the headings to refer to cells within the table.

note: as shown in figure 8.30, there is a lightning bolt drop-down to the right of cell d3. this drop-down offers you the opportunity to stop excel from automatically copying the formula down.

figure 8.29 defining a range as a table provides formatting and powerful features such as autofilters and natural language formulas.

troubleshooting tip: overcoming the arcane rules for fill handle double-clicking

the fill handle double-click trick is one of the best and least-known tricks, and it is sure to get a bunch of gasps in my excel seminars. excel has an arcane set of rules it follows when you double-click the fill handle. the rules changed in excel 2010 to make the operation work even better.
in previous versions of excel, the fill handle would first check if the cell immediately below is nonblank. in figure 8.31, the nonblank cell in b3 means that the fill handle would copy the formula in b2 down to b7. if the cell just below is empty, then excel looks to the left. if you double-click the fill handle in f2, excel 2007 copies the formula down to row 12 based on the contiguous range in e2:e12. if the cell to the left is empty, excel 2007 looks to the column on the right. double-click the fill handle in j2, and excel copies the formula down to row 10 based on the data in k2:k10.

now when you enter a formula in the table, excel automatically copies that formula down to all rows of the table.

figure 8.30 thanks to the new table tool in excel 2007, a new formula entered anywhere in column d is copied automatically to all the cells in column d.

figure 8.31 in excel 2007 and earlier, excel would look immediately below, then left, and then right of the active cell.

in excel 2010, the cell immediately below the active cell still trumps all other cells. in figure 8.32, if you double-click the fill handle in c2, excel will see the nonblank cell in c3. in that figure, the formula will copy down to row 7 because of the filled cells in c3:c7.

if excel 2010 does not encounter a value immediately below the active cell, it looks at the seven cells shaded in figure 8.33. provided any of those cells are nonblank, excel will launch into a new, powerful current-region algorithm.

excel understands the concept of a current region. if you are in a cell and press ctrl+*, excel extends the selection in all directions until it encounters the edge of the worksheet or a completely blank edge of the data. this concept allows excel to extend past single blank cells and select the whole data set.

in excel 2010, excel looks at the current region around the active cell. in figure 8.33, the current region is a1:f12.
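the current-region idea can be sketched in python as a rectangle that keeps growing while any cell on its one-cell border is nonblank. this is an assumption about how the region is found, written to match the description above; it is not excel's actual code, and the grid below is made-up sample data rather than the data in figure 8.33.

```python
def current_region(grid, row, col):
    """Return (top, left, bottom, right), 0-based and inclusive, around grid[row][col].

    Hypothetical helper: grows a rectangle until the border around it is entirely blank
    or the edge of the grid is reached.
    """
    n_rows, n_cols = len(grid), len(grid[0])

    def filled(r, c):
        # Cells outside the grid count as blank.
        return 0 <= r < n_rows and 0 <= c < n_cols and grid[r][c] not in (None, "")

    top, left, bottom, right = row, col, row, col
    changed = True
    while changed:
        changed = False
        # Each check scans one side of the border, corners included.
        if any(filled(top - 1, c) for c in range(left - 1, right + 2)):
            top -= 1
            changed = True
        if any(filled(bottom + 1, c) for c in range(left - 1, right + 2)):
            bottom += 1
            changed = True
        if any(filled(r, left - 1) for r in range(top - 1, bottom + 2)):
            left -= 1
            changed = True
        if any(filled(r, right + 1) for r in range(top - 1, bottom + 2)):
            right += 1
            changed = True
    return top, left, bottom, right


# Made-up data: headings in row 0, a blank in the middle column of row 3,
# a completely blank row 5, and stray data in row 6.
SAMPLE_GRID = [
    ["h1", "h2", "h3"],
    [1, None, "formula"],
    [2, 5, None],
    [3, None, None],
    [4, 7, None],
    [None, None, None],
    [9, None, None],
]

if __name__ == "__main__":
    print(current_region(SAMPLE_GRID, 1, 2))
```

in this sample the single blank cell in the middle column does not stop the region, but the completely blank row does, so the region ends one row above it.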
excel copies the formula in c2 down to row 12 because that is how far the current region extends. this is better than the excel 2007 algorithm that would have been fooled into stopping the copy at c5 by the blank cell in b6.

figure 8.32 in excel 2010, if the cell below the active cell is nonblank, excel copies the cell until it encounters a blank cell in the current column.

figure 8.33 if any of the seven shaded cells around c2 are nonblank, excel uses the entire current region to determine the last row.

this allows excel to find the true bottom of a column in some remarkably sparse data sets. in figure 8.34, b2, c3, and d2 are all empty. this normally would have prevented the fill handle from copying the formula. however, in excel 2010, excel follows the path of c2 – d1 – e1 – f2 – f8 – e9 – e12 to find that the true bottom of the data set is row 12.

it is good to understand the arcane rules involved in the fill handle double-click trick. although it works most of the time in excel 2010, be aware that if you are on a computer with excel 2007, the old rules will apply.

figure 8.34 excel 2010 accurately copies the formula down to c12.

9 controlling formulas

although you can go a long way with simple formulas, it is also possible to build extremely powerful formulas. the topics in this chapter explain the finer points of formula operators, date math, and how excel distinguishes between cutting and copying cells referenced in formulas.

formula operators

excel offers the mathematical operators shown in table 9.1.
table 9.1 mathematical operators

operator    description
+           addition
-           subtraction
*           multiplication
/           division or fractions
^           exponents
()          overriding the order of operations
-           unary minus (for negative numbers)
&           joining text (concatenation)
>           greater than
<           less than
>=          greater than or equal to
<=          less than or equal to
<>          not equal to
=           equal to
,           union operator as in =sum(a1,b2)
:           range operator as in =sum(a1:b2)
(space)     intersection operator as in =sum(a:j 2:4)

order of operations

when a formula contains many calculations, excel evaluates the formula in a certain order. rather than calculating from left to right as a calculator might, excel performs certain types of calculations, such as multiplication, before calculations such as addition. you can override the default order of operations with parentheses. if you do not use parentheses, excel uses the following order of operations:
1. unary minus is evaluated first.
2. exponents are evaluated next.
3. multiplication and division are handled next, in a left-to-right manner.
4. addition and subtraction are handled next, in a left-to-right manner.

the following sections provide some examples of order of operations.

unary minus example

the unary minus is always evaluated first. think about when you use exponents to raise a number to a power. if you raise -2 to the second power, excel calculates (-2) * (-2), which is 4. so the formula =-2^2 evaluates to 4. if you raise -2 to the third power, excel calculates (-2) * (-2) * (-2). multiplying -2 by -2 results in 4, and multiplying 4 by -2 results in -8. so the simple formula =-2^3 generates -8.

you need to understand a subtle but important distinction. when excel encounters the formula =-2^3, it evaluates the unary minus first. if you want the exponent to happen first and then have the unary minus applied, you have to write the formula as =-(2^3).
however, in a formula such as =100-2^3, the minus sign is considered to be a subtraction operator and not a unary minus sign. in this case, 2^3 is evaluated as 8, and then 8 is subtracted from 100.

figure 9.1 shows the results of three formulas involving raising -2 to various powers.

note: to see how excel calculates the formulas you enter, first enter a formula in a cell. next, on the formulas tab, select formula auditing, evaluate formula to open the evaluate formula dialog and watch the formula calculate in slow motion.

addition and multiplication example

the order of operations is important when you are mixing addition/subtraction with multiplication/division. for example, if you want to add 20 to 30 and then multiply by 1.06 to calculate a total with tax, the following formula leads to the wrong result:

=20+30*1.06

the result you are looking for is 53. however, the evaluate formula dialog shows that excel calculates the formula =20+30*1.06 like so (see figure 9.2):

1.06 * 30 = 31.8
31.8 + 20 = 51.8

figure 9.1 beware of the unary minus. three very similar formulas have different results, depending on where the minus sign occurs.

figure 9.2 the underline indicates that excel does the multiplication first.

excel’s answer is 1.20 less than expected because the formula is not written with the default order of operations in mind. to force excel to do the addition first, you need to enclose the addition in parentheses:

=(20+30)*1.06

figure 9.3 shows the second step in the evaluate formula dialog for this formula.
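the arithmetic in the unary minus and sales tax examples can be checked with a short python sketch. one caution: python binds the exponent operator more tightly than the unary minus, which is the opposite of excel, so the excel behavior is spelled out here with explicit parentheses. the variable names are made up for illustration, and decimal is used only to keep the tax figures exact.

```python
from decimal import Decimal

# Excel applies the unary minus before the exponent, so =-2^2 behaves like (-2)^2.
# Python would read -2 ** 2 as -(2 ** 2), so the Excel grouping is written out explicitly.
excel_neg2_squared = (-2) ** 2       # =-2^2    in Excel
excel_neg2_cubed = (-2) ** 3         # =-2^3    in Excel
negated_power = -(2 ** 3)            # =-(2^3)  in Excel
subtraction = 100 - 2 ** 3           # =100-2^3 in Excel (minus is subtraction here)

tax_multiplier = Decimal("1.06")
without_parens = 20 + 30 * tax_multiplier    # =20+30*1.06   multiplication happens first
with_parens = (20 + 30) * tax_multiplier     # =(20+30)*1.06 addition happens first

if __name__ == "__main__":
    print(excel_neg2_squared, excel_neg2_cubed, negated_power, subtraction)
    print(without_parens, with_parens, with_parens - without_parens)
```

the last printed value is the 1.20 shortfall described above.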
the addition in parentheses is done first, and then 50 is multiplied by 1.06 to get the correct answer of 53.

stacking multiple parentheses

if you need to use multiple sets of parentheses when doing math by hand, you might write math formulas with square brackets and curly braces, like this:

{3-[6*4*3-(3-6)^2]/27}*14

in excel, you use multiple sets of parentheses, as follows:

=(3-(6*4*3-(3-6)^2)/27)*14

formulas with multiple parentheses in excel are confusing. excel does two things to try to improve this situation:
• as you type a formula, excel colors the parentheses in a set order: black, forest, purple, brown, green, orange, violet, blue, forest, purple, brown, green, orange, and so on. if you bump the zoom up on your screen to 400%, you might actually be able to make out the color differences in the parentheses. however, at the normal zoom, the colors black, green, purple, and brown are all dark, and it is difficult to tell one from the other.
• when you type a closing parenthesis, excel shows the opening parenthesis in bold for a fraction of a second. this would be more helpful if excel kept the opening parenthesis in bold for 5 seconds or 20 seconds. however, when it is displayed for only about half a second, it is nothing more than a frustrating reminder that your reflexes are not fast enough. figure 9.4 attempts to show the bolded condition at a ridiculously high zoom.

figure 9.3 excel evaluates the operation in parentheses first.

tip: remember that the first and last parentheses will always be black. excel might repeat various shades of brown, orange, green, and purple, but it never reuses black as a parenthesis color. this means if the last parenthesis in your formula is not black, you have the wrong number of parentheses.

out of frustration with excel’s inability to highlight the matching parenthesis, i usually resort to using notepad to make understanding complicated formulas easier. to do this, select the cell.
in the formula bar, drag to select the entire formula, and then press ctrl+c to copy it. next, open a new notepad window and paste the copied formula into notepad. as shown in figure 9.5, you can add line breaks and spaces in notepad to visualize the formula.

understanding error messages in formulas

don’t be frustrated when a formula returns an error result. this eventually happens to everyone. the key is to understand the difference between the various error values so that you can begin to troubleshoot the problem. as you enter formulas, you might encounter a number of errors, including those listed next:
• #value!: this error indicates that you are trying to do math with nonnumeric data. for example, the formula =4+"apple" returns a #value! error. this error also occurs if you try to enter an array formula, but fail to use ctrl+shift+enter, as described in chapter 17, “using super formulas in excel.”
• #div/0!: this error occurs when a number is divided by zero, that is, when a fraction’s denominator evaluates to zero.
• #ref!: this error occurs when a cell reference is not valid. for example, this error can occur if one of the cells referenced in the formula has been deleted. it can also occur if you cut and paste another cell over a cell referenced in this formula. you may also get this error if you are using dynamic data exchange (dde) formulas to link to external systems and those systems are not running.

figure 9.4 this screen shot shows that after you type the fifth closing parenthesis, the tenth opening parenthesis is briefly shown in bold.

figure 9.5 you copy the formula from the formula bar and paste to a text editor, where you can actually break the formula into components.

• #n/a: this error occurs when a value is not available to a function or a formula. #n/a
errors most often occur because of key values not being found during lookup functions. they can occur as a result of hlookup, lookup, match, or vlookup. they can also result when an array formula has one argument that is not the same shape as the other arguments or when a function omits one or more required arguments. interestingly, when a #n/a error enters a range, all subsequent calculations that refer to the range have a value of #n/a.
• ######: this is not really an error. instead, it means that the result is too wide to display in the current column width, so you need to make the column wider to see the actual result.

in figure 9.6, cell e17 is a simple sum function. it is returning a #n/a error because cell e11 contains the same error. cell e11 contains the formula =d11*c11. the root cause of the problem is the vlookup function in cell d11. because dill cannot be found in the product table in g7:h9, the vlookup function returns #n/a.

figure 9.6 shows only a small table, so it is relatively easy to find the earlier #n/a errors. however, if you were totaling 100,000 rows, it could be difficult to find the one offending cell. to track down errors, follow these steps:
1. select the cell that shows the final error. to the left of that cell, you should see an exclamation point in a yellow diamond.

caution: whereas ###### usually means the column is not wide enough, excel also uses this symbol to indicate that you are subtracting a later date from an earlier date. if making the column wider does not clear the #### signs, check the formula bar to see if your formula might be subtracting dates.

figure 9.6 the error in e17 is actually caused by an error two calculations earlier.

2. hover the cursor over the yellow diamond to reveal a drop-down arrow.
3. from the drop-down menu, select trace error. excel draws in red arrows pointing back to the source of the error, as shown in figure 9.7. for example, from the original #n/a
error in cell d11, blue arrows demonstrate what cells were causing the error.
4. repeat steps 1-3 for the cell causing the error to trace the original root cause of the problem.

using formulas to join text

you use the ampersand (&) operator when you need to join text. in excel the & operator is known as the concatenation operator. for example, the formula =a2&b2 joins the text values shown in the two cells a2 and b2, as shown in figure 9.8.

figure 9.7 selecting trace error reveals the cells leading to the error.

figure 9.8 the & operator is used to join two text cells.

when using the & operator, you might want to include a space between the two items that are combined to improve the appearance of the output. for example, if the cells contain first name and last name, you might want to have a space between the names. to include a space between cells, you follow the & with a space enclosed in quotes, such as &" ". as shown in figure 9.9, the formula =a2&" "&b2 generates a better-looking result than =a2&b2.

to watch a video of joining text, search for excel in depth 9 at youtube.

caution: when you enter the formula in figure 9.9, you have to hold down the shift key to enter the quotation marks in " ". many excel users accidentally hold down the shift key while pressing the spacebar. however, shift+spacebar is the excel shortcut for selecting an entire row. if your formula changes to =a2&"a:a because you pressed shift+spacebar, you can press the esc key and start over.

joining text and a number

in many cases, you can use the & operator to join text with a number. in figure 9.10, the formula in cell c2 joins the words “the price is ” with the result of the calculation in cell b2. because cell b2 contains an integer with no special formatting, the answer appears correctly.

figure 9.9 you can join cells with any text in quotation marks.
figure 9.10 joining text with a number works if the number is formatted with general formatting.

in figure 9.11, cell b2 is formatted to display a currency symbol and two decimal places. when you join this value to text in cell c2, excel ignores the formatting in cell b2 and shows the result with all the decimal places. a similar problem exists when you want to join text with a date. excel ignores the fact that the value in cell b4 is formatted as a date and shows the underlying value in cell c4.

figure 9.11 joining text with a date or with formatted numbers rarely works well.

in this case, you need to discover the numeric formatting code associated with the original cell. to do so, follow these steps:
1. select cell b2.
2. press ctrl+1 to display the format cells dialog.
3. on the number tab, choose the custom category. this reveals the actual formatting codes for the cell. as shown in figure 9.12, the actual formatting code for b2 is $#,##0.00. if you repeat these steps for cell b4, you learn that the actual formatting code is m/d/yyyy.

figure 9.12 if you choose the custom category, you learn the actual codes used to produce the numeric format.

for a complete discussion of numeric formatting codes, see chapter 30, “formatting worksheets.”

when you know the numeric formatting codes, you can achieve the desired effect by using those codes, enclosed in quotation marks, as the second argument of the text function. in figure 9.13, the formula in cell c2 is ="the answer is "&text(b2,"$#,##0.00"). you can use your knowledge of custom numeric formatting codes to change the look of the joined value. in cell c4, the new formula is ="his birthday is "&text(b4,"dddd mmmm d, yyyy").

figure 9.13 use the text function to replicate formatting, as in cell c2, or to change the formatting, as in cell c4.

copying versus cutting a formula

in figure 9.14, the formula in cell c7 references =a7+b7.
because there are no dollar signs within the formula, those are relative references.

to learn more about relative versus absolute references, see chapter 10, “understanding functions.”

if you copy cell c7 and paste it to cell g3, the formula works perfectly, as shown in figure 9.15.

figure 9.14 the formula in cell c7 adds the two numbers to the left of the formula.

figure 9.15 when cell c7 is copied to cell g3, the formula still adds the two numbers to the left of the formula.

however, if you cut cell c7 and paste it to a new location, the formula continues to point to cells a7 and b7, as shown in figure 9.16.

whereas cutting and copying are relatively similar in applications such as word, they are very different in excel. it is important to understand the effect of cutting a formula in excel in contrast to copying the formula. when you cut a formula, the formula continues to point to the original precedents, no matter where you paste it.

a similar rule applies to the references mentioned in a formula. for example, the formula in cell c7 points to a7 and b7. as long as you copy cell a7 and/or cell b7, you can paste them anywhere without changing the formula in c7. figure 9.17 shows the result of copying a7:b7 for 20 rows: nothing changes in the formula.

figure 9.16 using cut and paste on a formula forces the formula to continue to point to the original cells.

figure 9.17 copying the precedent cells from cell c7 does not have any effect on cell c7.

however, if you cut and paste a7:b7 to a new location such as e5:f5, the formula in cell c7 changes. after the paste, the formula points to the new location of the pasted cells, as shown in figure 9.18.
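the difference between copying and cutting a formula can be sketched in python. copying shifts every relative reference by the distance between the source and destination cells, whereas cutting leaves the formula text untouched. the helper names below are hypothetical, and the regular expression is a simplification that handles only plain single-cell a1-style references (it does not handle sheet names, ranges that run off the grid, or function names that look like references).

```python
import re

# Optional $ before the column letters and before the row number.
_REF = re.compile(r"(\$?)([A-Z]{1,3})(\$?)(\d+)")


def col_to_num(col: str) -> int:
    """Convert column letters to a 1-based number (A -> 1, Z -> 26, AA -> 27)."""
    num = 0
    for ch in col:
        num = num * 26 + (ord(ch) - ord("A") + 1)
    return num


def num_to_col(num: int) -> str:
    """Convert a 1-based column number back to letters."""
    letters = ""
    while num > 0:
        num, rem = divmod(num - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def copy_formula(formula: str, row_offset: int, col_offset: int) -> str:
    """Return the formula as it would read after a copy and paste (hypothetical helper).

    References marked with $ stay fixed; relative parts move by the given offsets.
    """
    def shift(match):
        col_abs, col, row_abs, row = match.groups()
        new_col = col if col_abs else num_to_col(col_to_num(col) + col_offset)
        new_row = row if row_abs else str(int(row) + row_offset)
        return f"{col_abs}{new_col}{row_abs}{new_row}"

    return _REF.sub(shift, formula)


def cut_formula(formula: str) -> str:
    """A cut and paste keeps pointing at the original precedents, so the text is unchanged."""
    return formula


if __name__ == "__main__":
    # C7 -> G3 is four rows up and four columns to the right.
    print(copy_formula("=A7+B7", row_offset=-4, col_offset=4))
    print(cut_formula("=A7+B7"))
```

the same helper shows why dollar signs matter: copying =B2*$F$1 down 18 rows gives =B20*$F$1, with the tax-rate reference held in place.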
figure 9.18 cutting the precedent cells from cell c7 causes the formula in c7 to change to reflect the new location.

automatically formatting formula cells

the rules for formatting the result of a formula seem to be inconsistent. suppose that you have $1.23 in cell a1, as shown in figure 9.19.

figure 9.19 in this case, all cells are in general format, except cell a1.

if you enter =a1*3 in another cell with general format, the result automatically inherits the currency format of cell a1, as shown in figure 9.20. note that the formatting is copied whether you use =a1*3 or =3*a1.

figure 9.20 no formatting is applied to b3. instead, excel copies the format from cell a1 automatically.

in figure 9.20, cell b3 is formatted automatically to match the only cell mentioned in the formula. it becomes harder to predict the automatic format when your formula refers to several cells, each with a different format. further, the result will change if you start your formula with a plus sign instead of an equal sign. many people use the plus sign because it is easy to type on the numeric keypad. however, microsoft probably considers using the plus sign as a lotus transition issue and applies different rules.

in figure 9.21, cell a1 is formatted as currency with two decimal places. cell a3 is formatted as number with three decimal places. cell a5 is formatted as percent with no decimal places. cells in columns c and f all add the three original cells. each formula specifies a1, a3, and a5 in a different sequence. formulas in c were entered starting with a plus sign. formulas in f were started with an equal sign.

the resulting automatic format does not appear to follow any pattern. when you use an equal sign, the format is copied from either the first or the last cell referenced. when you use a plus sign, the format sometimes comes from the second, first, or last reference; and sometimes the format is a mix of two references.
if your formula is going to refer to multiple cells with different formatting, start the formula with an equal sign. refer to the cell with the desired cell format first, but accept that you might have to explicitly format the resulting cell.

using date math

dates in excel are stored as the number of days since january 1, 1900. for example, excel stores the date feb-17-2011 as 40591. in figure 9.22, cell a1 contains the date. cell a2 contains the formula =a1 and has been formatted to show a number.

figure 9.21 when a formula refers to cells with different formats, the resulting cell’s format varies depending on which cell was mentioned first or last.

figure 9.22 although cell a1 is formatted as a date, excel stores the date as the number of days since january 1, 1900.

this convenient system allows you to do some pretty simple math. for example, figure 9.23 shows a range of invoice dates in column b. the terms for the invoice are in column d. you can calculate the due date by adding cells b2 and d2. here is what actually happens in excel’s calculation engine:
1. the date in cell b2, 2/1/2011, is stored as 40575.
2. excel adds 10 to that number to get the answer 40585.
3. excel formats this number as a date, to yield 2/11/2011.

figure 9.23 when the answer is formatted correctly, excel’s date math is very cool.

however, a frustrating problem can occur if the cell containing the formula has the wrong numeric format. for example, in figure 9.24, the workday function in column d did not automatically convert the result to a date.

it is important to recognize that dates in 2010–2013 fall in the range of 40,179 to 41,639. so, if you are expecting a date answer as the result of a formula and get a number in this range, the answer probably needs to have a date format applied.

figure 9.24 the formula appears to give the wrong answer. however, this is a formatting problem.
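the serial numbers quoted in this section can be reproduced with a short python sketch. the helper names are made up, and one assumption is stated in the code: the conversion uses december 30, 1899 as its zero point, which matches excel's 1900 date system only for dates from march 1, 1900 onward, because excel counts a february 29, 1900 that never existed.

```python
from datetime import date, timedelta

# Assumption: this offset reproduces Excel's 1900 date system only for dates on or after
# March 1, 1900, because Excel treats 1900 as a leap year.
_EXCEL_ZERO = date(1899, 12, 30)


def to_serial(day: date) -> int:
    """Return the serial number Excel stores for a date (hypothetical helper)."""
    return (day - _EXCEL_ZERO).days


def from_serial(serial: int) -> date:
    """Return the calendar date that a serial number displays as (hypothetical helper)."""
    return _EXCEL_ZERO + timedelta(days=serial)


if __name__ == "__main__":
    invoice = to_serial(date(2011, 2, 1))
    due = invoice + 10                      # same as adding the terms to the invoice date
    print(invoice, due, from_serial(due))
    print(to_serial(date(2010, 1, 1)), to_serial(date(2013, 12, 31)))
```

a result that falls between the last two printed numbers is very likely a date from 2010 through 2013 that is simply waiting for a date format.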
to apply a date format, on the home tab use the number drop-down to choose the date format. the answer in column d now appears correctly, as shown in figure 9.25.

in general, most formulas that refer to a date cell will automatically be formatted as a date. most formulas that contain functions from the date category will be formatted as a date. (the workday function is one annoying exception.) however, sometimes you do not want the result formatted as a date.

figure 9.25 after you apply the date format, the answer is displayed correctly.

for example, in figure 9.26, you would like to count the number of days between two dates. the formula in c4 should calculate 16 days. however, because excel automatically formatted the result as a date, the answer of 16 is shown as 1/16/1900.

in this case, you need to apply a numeric format to the range containing the formula. when you select number from the number drop-down on the home tab, the formula appears with the correct result (see figure 9.27).

anytime you are doing math between two dates, you should plan to change the format of the result to be either number or date, depending on the situation.

to learn about many other useful date functions in excel, see chapter 11, “using everyday functions: math, date and time, and text functions.”

figure 9.26 in this case, you want the answer to be formatted as a number instead of a date.

figure 9.27 when you apply a numeric format, the answers are correct.

troubleshooting formulas

it is difficult to figure out worksheets that were set up by other people. when you receive a worksheet from a co-worker, use the information in the following sections to find and examine the formulas.

highlighting all formula cells

the first technique to use when examining a new worksheet is to find all the cells that contain formulas. the following steps will identify all the formula cells in the worksheet:
1.
ensure that you have a single cell selected.
2. press f5 to display the go to dialog.
3. in the lower-left corner of the go to dialog, click the special button to display the go to special dialog.
4. in the go to special dialog, select the formulas option button, as shown in figure 9.28. click ok to select all formula cells.

figure 9.28 the go to special dialog has many incredibly powerful features.

5. to highlight all the cells that contain formulas quickly, use the paint bucket icon in the home tab to mark all the formula cells, as shown in figure 9.29.

seeing all formulas

for a long time, excel has given users the ability to see all the formulas in a worksheet. the mode that provides this functionality is called show formulas mode. on most u.s. keyboards, the key just below the esc key in the upper-left corner of the keyboard contains a tilde (~) and also a backtick (`). in previous versions of excel, you had to press ctrl+` to toggle into and out of show formulas mode, in which each column is a little wider. instead of showing the results of the formula, in show formulas mode, each cell shows the formula itself (see figure 9.30). this keystroke had to be pressed again to return to regular mode.

in addition to recognizing ctrl+`, excel 2010 offers an icon in the formula auditing section of the formulas tab that enables you to toggle into and out of show formulas mode.

figure 9.29 immediately after selecting all formulas, select the paint bucket icon to color the formula cells.

figure 9.30 show formulas mode shows the formulas in the cells instead of the results. this is handy for quick formula auditing.

editing a single formula to show direct precedents

it is helpful to identify cells that are used to calculate a formula. these cells are called the precedents of the cell. a cell can have several levels of precedents.
in a formula such as =d5+d7, there are two direct precedents: d5 and d7. however, all the direct precedents of d5 and d7 are second-level precedents of the original formula.

if you are interested in visually examining the direct precedents of a cell, follow these steps:

1. select a cell that has a formula.
2. press f2 to put the cell in edit mode. in this mode, each reference of the formula is displayed in a different color. for example, the formula in cell h5 refers to three cells. the characters f5 in the formula appear in blue and correspond to the blue box around cell f5 in figure 9.31.
3. visually check the formula to ensure that it is correct.

figure 9.31 editing a single formula lights up the direct precedent cells.

using formula auditing arrows

if you have a complicated formula, you might want to identify direct precedents and then possibly second- or third-level precedents. you can have excel draw arrows from the current cell to all cells that make up the precedents for the current cell. to have excel draw arrows, follow these steps:

1. from the formula auditing group on the formulas tab, click trace precedents. excel draws arrows from the current cell to all the cells that are directly referenced in the formula. for example, in figure 9.32, an arrow is drawn to a worksheet icon near cell b30. this indicates that at least one of the precedents for this cell is on another worksheet.
2. click trace precedents again. excel draws arrows from the precedent cells to the precedents of those cells. these are the second-level precedents of the original cell. figure 9.33 shows the results of clicking trace precedents five times. practically every cell on the worksheet is a precedent of cell d32.
3. to remove the arrows, use the remove arrows icon in the formula auditing group.
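the idea of precedent levels can also be sketched outside excel. the following python sketch is an illustration only and is not how excel implements trace precedents: the `refs` dictionary, its cell addresses, and the `precedent_levels` helper are all hypothetical. each formula cell maps to the cells it references, and the helper walks outward one level at a time, much like repeated clicks on trace precedents.

```python
def precedent_levels(refs, start):
    """Return a list of sets: the first set holds the direct precedents of
    `start`, the second set holds their precedents, and so on."""
    levels = []
    seen = {start}        # cells already assigned to a level
    frontier = {start}    # cells whose references are examined next
    while frontier:
        next_level = set()
        for cell in frontier:
            for ref in refs.get(cell, ()):
                if ref not in seen:
                    seen.add(ref)
                    next_level.add(ref)
        if next_level:
            levels.append(next_level)
        frontier = next_level
    return levels


# hypothetical worksheet: D9 holds =D5+D7, D5 holds =B5*C5, D7 holds =B7*C7
refs = {
    "D9": ["D5", "D7"],
    "D5": ["B5", "C5"],
    "D7": ["B7", "C7"],
}

levels = precedent_levels(refs, "D9")
for depth, cells in enumerate(levels, start=1):
    print(depth, sorted(cells))
```

in this made-up sheet, the first level contains d5 and d7, and the second level contains the four input cells behind them. deleting any cell at any level would break the formula in d9.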
tip although it may be impossible to follow the lines for all five levels of precedents in figure 9.33, this figure indicates that the formula in d32 is vastly more complex than the six cells mentioned in the formula. to have a thorough understanding of what your co-worker built in this worksheet, you need to follow the logic through two dozen cells. in other words, you should not delete any of those cells if you want the formula to continue to calculate a correct result.

figure 9.32 the results of trace precedents for cell d32.

figure 9.33 precedents traced five times to find every precedent of the formula.

tracing dependents

the formula auditing section provides another interesting option besides the ones discussed so far in this chapter. you can use the formula auditing section to trace dependents so you can find all the cells on the current worksheet that depend on the active cell. before deleting a cell, consider clicking trace dependents to determine whether any cells on the current sheet refer to this cell. this will prevent many #ref! errors from occurring.

using the watch window

if you have a large spreadsheet, you might want to watch the results of some distant cells. you can use the watch window icon in the formula auditing section of the formulas tab to open a floating box called the watch window screen. to use the watch window screen, follow these steps:

1. click the add watch icon. the add watch dialog appears.
2. in the add watch dialog, specify a cell to watch, as shown in figure 9.34.

after you add several cells, the watch window screen floats above your worksheet, showing the current value of each cell that was added to it. the watch window identifies the current value and the current formula of each watched cell. in theory, this feature can be used to watch a value in a far-off section of the worksheet.
caution even if tracing dependents does not show any cells that are dependent on the current cell, other cells on other worksheets or in other workbooks might rely on this cell.

tip to jump to a watched cell quickly, you can double-click the cell in the watch window screen.

figure 9.34 adding a watch to the watch window screen.

evaluate a formula in slow motion

most of the time, excel calculates formulas in an instant. it can help your understanding of a formula if you watch it being calculated in slow motion. if you need to see exactly how a formula is being calculated, follow these steps:

1. select the cell that contains the formula in which you are interested.
2. on the formulas tab, in the formula auditing group, select evaluate formula. the evaluate formula dialog appears, showing the formula. one component of the formula is underlined: it is the next section of the formula to be calculated.
3. if desired, click evaluate to calculate the underlined portion of the formula.
4. click step in to begin a new evaluate section for the cell references in the underlined portion of the formula. figure 9.35 shows the evaluate formula dialog after stepping in to the e30 portion of the formula.

figure 9.35 the evaluate formula dialog allows you to calculate a formula in slow motion.

evaluating part of a formula

when you need to evaluate only part of a formula rather than step through the entire formula, you can calculate that portion directly in the formula bar. follow these steps to evaluate part of a formula:

1. use the mouse to select just the desired portion of the formula in the formula bar, as shown in figure 9.36.
2. press f9. excel calculates just the highlighted portion of the formula, as shown in figure 9.37.

figure 9.36 you can select a portion of the formula in the formula bar.

be sure to press the esc key to exit the formula after you use this method.
instead, if you press enter to accept the formula, that portion of the formula permanently stays in its calculated form, such as 0.407407.

figure 9.37 press f9 to calculate just the highlighted portion of the formula.

excel in practice: moving the formula tooltip

as you type a formula, excel 2010 offers a tooltip to show the order of the arguments for a function. as shown in figure 9.38, this tooltip can frequently get in the way by covering nearby cells. a cool trick is to click the tooltip and drag it to a new location. as you continue entering the formula, the tooltip stays in the new, detached location (see figure 9.39).

tip if you click on the function name in the tooltip, excel opens the help topic for that function.

figure 9.38 the tooltip is covering adjacent cells.

figure 9.39 drag the tooltip to an out-of-the-way location.

10 understanding functions

excel is used on 500 million desktops around the world. people from all careers use excel, as do many home users who use excel’s powerful features to track their finances, investments, and more. part of excel’s versatility is its wide range of built-in functions.

excel 2003 offered 255 built-in functions. another 89 functions shipped with excel but were available only to people who installed the analysis toolpak (atp). excel 2010 now offers 400 built-in functions. although this sounds like a tremendous increase, only about seven functions will be of interest to most people:

• excel 2010 adds the new aggregate function, which is similar to subtotal but with 19 calculations instead of 11.
• excel 2007 added the iferror function to reduce #n/a and #div/0! errors.
• excel 2007 introduced averageif along with the plural sumifs, countifs, and averageifs functions. the plural versions handle multiple conditions and are amazingly fast. if you’ve ever used sumproduct because sumif handles only one condition, you will love sumifs.

what are the other new functions?
they include the following:

• international versions of existing functions: networkdays.intl, workday.intl, iso.ceiling, and iso.floor. the former workday functions assumed a workweek of monday through friday. the new versions can handle workweeks in which the weekend is two other days.
• thirty-eight statistical distribution functions. some of these are truly new, and some of them are renamed versions of existing functions. in the past, some functions returned a left-tailed point distribution function and others returned a left-tailed cumulative distribution function. some offered an inverse function and others did not. microsoft standardized the naming of the functions, offering consistent naming for the distribution function, for the inverse function, and so on. the old function names are still supported for compatibility with legacy versions of excel.
• fourteen other statistical functions. new ways to calculate rank, mode, percentile, and quartile. consistent naming for var and stddev.
• eighty-nine atp functions, which are now part of the default excel installation.
• seven new cube functions that are useful to people who connect to a multidimensional database, such as sql server analysis services.

microsoft also invested heavily to improve the accuracy of several functions. over the years, various academic papers had noted that excel was doing a poor job when the input values for certain functions started to get out into extreme ranges. for this reason, microsoft hired three mathematical consulting firms to assist in this process. two firms (frontline systems and numerical algorithms group) were hired to propose new algorithms. a third firm, scienceops, considered the competing algorithms from the first two firms to judge which algorithm should be adopted.

what does this mean? it means that for certain input values, the following functions will return improved values in excel 2010. it doesn’t mean that every answer will be different.
however, if you have a worksheet that is doing calculations near the numerical limits of the function, you might get different results when compared to computers running excel 2007 or earlier.

• statistical distribution functions with improved accuracy include betadist, betainv, binomdist, critbinom, chidist, chiinv, expondist, fdist, finv, gammadist, gammainv, hypgeomdist, lognormdist, loginv, negbinomdist, normdist, norminv, normsdist, normsinv, poisson, tdist, tinv, and weibull.
• financial functions with improved accuracy include cumipmt, cumprinc, ipmt, irr, pmt, ppmt, and xirr.
• math functions with improved accuracy include asinh, convert, erf, erfc, gammaln, geomean, mod, rand, stdevs, and vars.

the functions available in excel 2010 are applicable to a wide range of industries. financial functions help investors, bankers, and bond traders. math and statistical functions help scientists. there are engineering functions for engineers and general-purpose functions for everyone. no matter what you are trying to do in excel, there are functions for you.

if you cannot find a built-in function, there is a good chance that a third-party vendor sells an add-in program that adds new customized functions to excel to assist in your particular industry. if not, you can pick up a book on programming vba to learn how to write your own custom functions in excel. refer to chapter 4 of vba and macros for microsoft excel 2010 by jelen and syrstad (que, isbn 0789743140) to learn about the 30 cool functions that you can add to excel.

tip did you realize that function names could include a period? there were three functions in excel 2003 that had periods: register.id, sql.request, and error.type. you might have never encountered these functions previously.
in excel 2010, 56 of the new statistical functions are implemented with a period, so you should get used to seeing functions like stdev.s, var.p, percentile.exc, and more.

working with functions

to use functions successfully in a worksheet, you need to follow the function syntax. keep in mind that a formula that makes use of a function needs to start with an equal sign. you type the function name, an opening parenthesis, function arguments (separated by commas), and the closing parenthesis.

the general syntax of a function looks like this:

=functionname(argument1,argument2,argument3)

in general, there should be no spaces anywhere in a function. specifically, you should never use a space between the function name and the opening parenthesis. some people like to add a space after each comma in a function, like this:

=functionname(argument1, argument2, argument3)

although this is not required, it does increase the readability of the final function. for what it’s worth, excel correctly calculates a formula with or without these spaces, so it’s a personal choice as to whether you include them.

parentheses are needed with every function, including functions that require no arguments. for example, these functions still require the parentheses:

=now()
=today()
=rand()

the arguments for a function should be entered in the correct order, as specified in this book or excel help. for example, the pmt() function expects the arguments to have the interest rate first, followed by the number of periods, followed by the present value. if you attempt to send the arguments in the wrong order, excel will happily calculate the wrong result.

in many cases, you can enter arguments as numbers or as cell references. for example, all these formulas are valid:

=sum(1,2,32,4/5,6*7)
=sum(a1:a9,c1,d2,sheet2!e3:m10)
=sum(a1:a9,100,200,b3*5)

the formulas tab in excel 2010

one way to find functions in excel 2010 is on the formulas tab.
this tab offers function wizard, autosum, recently used, financial, logical, text, date & time, lookup & reference, math & trig, and more functions icons.

note excel functions can return a number of errors. this happens most frequently when one of the arguments passed to the function is outside the range of what the function expects. when you receive a #num!, #value!, or #n/a error, you should look in excel help for the function. the remarks section usually indicates exactly what problems can cause each type of error.

note chapters 11 through 15 cover all of the 400 functions. this chapter covers a number of the most commonly used functions and the new functions in excel 2010.

as shown in figure 10.1, when you click the more functions icon, a drop-down appears with five additional function groups: statistical, engineering, cube, information, and compatibility.

figure 10.1 the formulas tab contains icons for finding functions.

the formulas tab is designed to make it easier to find the right function. you select an icon from the ribbon, and an alphabetical list of functions in that group appears. if you hover your mouse over a function in the list, excel displays a description of what the function does, as shown in figure 10.2.

figure 10.2 hover over a function, and excel displays a tip explaining what the function does.

finding the function you need

the inherent problem with the formulas tab is that you often have to guess where your desired function might be hiding. the function categories have been established in excel for a decade, and in some cases, functions are tucked away in strange places.

for example, the sum() function is a math & trig function. this makes sense because adding numbers is clearly a mathematical process. however, the average() function is not available in the math & trig icon. (it is under more functions, statistical.)
the count() function could be math, reference, or information, but it is found under more functions, statistical. by dividing the list of functions up into categories, microsoft has made it rather difficult to find certain functions. fortunately, as described in the following sections, you can use some tricks to make this process simpler.

using autocomplete to find functions

one feature in excel 2010 is formula autocomplete. sometimes you might remember the first letter of a function but not all the rest of the letters. for example, there are five varieties of the function you use to do averages, and they all start with a. rather than trying to figure out whether the averaging function you need is in the math or statistical icon, you can just start typing =av in a cell. excel displays a pop-up window with all the functions that begin with av, as shown in figure 10.3.

figure 10.3 rather than use the icons on the formulas tab, you can type =av to display an alphabetical list of the av functions.

to accept a function name from the list, you can either double-click the function name or select the name and press tab.

using the function wizard to find functions

at the bottom of every list of functions is an icon for the insert function. to access the insert function dialog, you can also use the small fx button to the left of the formula bar, the more functions option at the bottom of the autosum drop-down, or the large function wizard button on the formulas tab. with 15 ways to access the function wizard, microsoft is telling you that this is a good way to find functions. choosing any of these options to open the function wizard causes the insert function dialog to appear.

in the excel 2003 version of the function wizard, microsoft added a handy search utility. for example, if you typed car payment and then clicked go, excel would suggest pmt (the correct function) as well as ppmt, ispmt, rate, and others.
the search functionality was a fantastic addition to excel 2003 and should be your first stop when trying to find a function in excel 2010.

when you choose a function in the insert function dialog, the dialog displays the syntax for the function, as well as a one-sentence description of the function, as shown in figure 10.4. if you need more details, you can click the help on this function hyperlink in the lower-left corner of the insert function dialog.

figure 10.4 the insert function dialog allows you to browse the syntax and descriptions. the help on this function hyperlink leads to more help.

getting help with excel functions

every excel function has three levels of help:

• in-cell tooltip
• function arguments dialog
• excel help

the following sections discuss these levels of help. however, you are sure to find the function arguments dialog to be one of the best ways of getting help.

using in-cell tooltips

in any cell, you can type an equal sign, a function name, and the opening parenthesis. excel displays a tooltip that shows the expected arguments. in many cases, this tooltip is enough to guide you through the function. for example, i can usually remember that the function for figuring out a car loan payment is pmt(), but i can never remember the order of the arguments. the tooltip, as shown in figure 10.5, is enough to remind me that rate comes first, followed by number of periods, and then the principal amount or present value. any argument names displayed in square brackets are optional, so in the example shown in figure 10.5, you know that you may not have to enter anything for fv or type.

tip by the way, you can click the formula tooltip and drag it to a new location on the worksheet. this can be useful if the tooltip is covering cells that you need to click when building the function.
if you click on the function name in the tooltip, excel will open help for that function.

as you type each comma in the function, the next argument in the tooltip lights up in boldface. this way, you always know which argument you are entering.

figure 10.5 the tooltip assists you in remembering the proper order for the arguments.

using the function arguments dialog

when you access a function through the function wizard or a drop-down list, excel displays the function arguments dialog. this dialog is one of the best features in excel.

tip if you type =functionname( in a cell, you can press ctrl+a anytime after the opening parenthesis to display the function arguments dialog.

figure 10.6 the function arguments dialog helps you build a function, one step at a time.

as shown in figure 10.6, the function arguments dialog has many elements:

• the one-sentence description of the function appears in the center of the dialog.
• as you tab into the text box for each argument, the description of the argument is shown in the dialog. this description guides you as to what excel is expecting. for example, in the dialog shown in figure 10.6, excel reminds you that the interest rate needs to be divided by four for quarterly payments. this reminds you to divide the apr in cell b3 by 12 for monthly payments.
• to the right of each argument in the dialog is a reference button. you can click this button to collapse the dialog so you can point to the cells for that argument.
• to the right of each text box is a label that shows the result of the entry for that argument.
• any arguments in bold are required. arguments not in bold are optional.
• after you enter the required arguments, the dialog shows the preliminary result of the formula. this is on the right side, just below the last argument text box. it appears again in the lower-left corner, just above the help on this function hyperlink.
• a help on this function hyperlink to the help topic for the function appears in the lower-left corner of the dialog.

using excel help

the excel help topics for the functions are incredibly complete. you will find the following sections in each function’s help topic:

• the function syntax appears at the top of the topic. this includes a description of each argument that may be more complete than the description in the function arguments dialog.
• the remarks section helps troubleshoot possible problems with the function. it discusses specific limits for each argument and describes the meaning of each possible error that could be returned from the function.
• each function has an example section. you can copy an example to a blank worksheet to see the function actually working.
• the see also section at the bottom of a help topic allows you to discover related functions. the logical groupings suggested by see also are far more useful than the category groupings in the formulas tab.

using autosum

microsoft realizes that the most common function is the sum() function. it is so popular that excel provides one-click access to the autosum feature.

the autosum icon is the large greek letter sigma that is the second icon on the formulas tab. you can click this icon to use autosum, or you can use the drop-down at the bottom of the icon to access autosum versions of average, count numbers, max, and min, as shown in figure 10.7.

when you click the autosum button, excel seeks to add up the numbers that are above or to the left of the current cell. in general, when you click the autosum icon, excel guesses which cells you are trying to sum. excel automatically types the sum() formula. you should review excel’s guess to make sure that excel chose the correct range to sum. in figure 10.8, for example, excel correctly guesses that you want to sum the column of revenue above the cell.
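to see why the proposed range deserves a second look, here is a deliberately simplified python model of the guess for a single column. it is an assumption made for illustration, not excel's actual algorithm: the hypothetical `guess_range_above` helper simply walks upward from the total cell while it keeps finding numbers. a text heading stops the walk, but a numeric heading such as 2008 is swept into the range.

```python
def is_number(value):
    """True for int/float cell values; text and booleans do not count."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def guess_range_above(column, total_row):
    """column: cell values for one column (index 0 is row 1).
    total_row: 1-based row of the empty cell that will hold the total.
    Returns (first_row, last_row) of the contiguous numeric block directly
    above the total cell, or None if the cell above is not a number."""
    last = total_row - 1
    if last < 1 or not is_number(column[last - 1]):
        return None
    first = last
    # keep extending upward while the next cell up is numeric
    while first > 1 and is_number(column[first - 2]):
        first -= 1
    return (first, last)


text_heading = ["revenue", 100, 200, 300]   # rows 1-4, total goes in row 5
numeric_heading = [2008, 100, 200, 300]     # the heading is itself a number

print(guess_range_above(text_heading, 5))     # (2, 4): heading excluded
print(guess_range_above(numeric_heading, 5))  # (1, 4): heading included
```

in the second case the total would be overstated by 2008, which is exactly the kind of mistake a quick review of the proposed range catches.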
figure 10.7 the autosum drop-down offers the capability to average and more.

figure 10.8 the autosum feature is proposing a formula to sum c2:c10.

tip pressing alt+= is equivalent to clicking the autosum icon.

potential problems with autosum

although you should always check the range proposed by the autosum feature, in some cases you should be especially wary. if your headings above the data are numeric, for example, this will fool autosum. in figure 10.9, the 2008 heading in c1 is numeric. this causes excel to include the heading incorrectly in the total for column c.

figure 10.9 numeric headings confuse autosum.

when excel proposes the wrong range for a sum, you use your mouse to highlight the correct range before pressing enter.

excel avoids including other sum() functions in an autosum range. if a range contains a sum() function that references other cells, excel prematurely stops just before the sum() function. this problem happens only when the sum() function references other cells. if the cell contained =7000+1878 or =h3+h4 or =sum(7000,1878), autosum would include the cell.

excel prefers to sum a column of numbers instead of a row of numbers. figure 10.10 shows a strange anomaly. if you place the cell pointer in cell f2 and click autosum, excel correctly guesses that you want to total b2:e2. cell f3 works fine. however, when you get to cell f4, excel has a choice. there are two numbers above f4 and four numbers to the left of f4. because there are two numbers directly above, excel tries to total those two numbers. this problem seems to happen only in the third row of the data set. after that, excel sees that the three cells above are all summing across the rows, and autosum works perfectly in f5:f11.

figure 10.10 excel can choose between summing two numbers above or four numbers to the left. excel chooses incorrectly.

special tricks with autosum

there is an amazing trick you can use with autosum.
if you select a range of cells before clicking the autosum button, excel does a much better job of predicting what to sum. in figure 10.10, for example, you could select f2:f10 before clicking the autosum button, and excel would know to sum each row. be careful, though, because excel does not preview its guess before entering the formula. you should always check a formula after using autosum to make sure the correct range was selected.

if your selection contains a mix of blank cells and nonblank cells, excel adds the autosum to only the blank cells. in figure 10.11, for example, you select the range b2:f11 before clicking the autosum button. after you click the autosum button, excel correctly fills in totals for all the rows and columns, as shown in figure 10.12.

figure 10.11 if your selection contains a mix of blank and nonblank cells, autosum writes only to the blank cells.

figure 10.12 by using autosum, you can add 14 sum() formulas with one click.

using the autosum drop-down

in legacy versions of excel, the autosum button was flanked by a small drop-down arrow that allowed you to use the autoaverage, automin, automax, and autocount features. in excel 2010, you can still click an arrow to access these features, and the drop-down arrow on the icon is more prominent than it used to be.

to see how this works, you can select a cell or range of cells. from the autosum drop-down, you choose another function. excel uses the same guessing logic as with autosum, but it instead enters a formula for average, min, max, or count, as shown in figure 10.13.

using the new general-purpose functions in excel

excel 2010 introduces the aggregate() function and includes five functions that were added in excel 2007: iferror(), averageif(), sumifs(), averageifs(), and countifs(). as you will see in the following sections, these functions are a great addition to the excel function set.
like subtotal, but better: aggregate()

figure 10.13 you can use the drop-down on the autosum icon to access the features autoaverage, automin, automax, and autocount.

excel 97 had introduced the subtotal function, which would look through the visible rows in a data set and do one of 11 functions. the aggregate() function is similar, with two nice improvements:

• aggregate supports 19 functions instead of 11. the first 11 functions are the same as those in the subtotal function: average, count, counta, max, min, product, stdev.s, stdev.p, sum, var.s, var.p. the new functions are median, mode.sngl, large, small, percentile.inc, quartile.inc, percentile.exc, and quartile.exc.
• where the subtotal function ignores hidden rows and other subtotal functions, the options argument in aggregate allows you to ignore the following:
  • nothing
  • hidden rows
  • error values
  • both hidden rows and error values
  • any of the above and other subtotal and aggregate functions

syntax: aggregate(function_num, options, array, [k])

the addition of the new functions makes the aggregate function a powerful addition to the function list.

added in excel 2007: iferror()

excel errors are not the friendliest things. no one really understands the screaming #value! or #n/a errors. with the all-capital letters and the exclamation point, they are scary looking. furthermore, the default behavior is that one single #n/a error in a column of one million good values causes the total for the column to calculate as #n/a.

for example, the vlookup() function is great for converting a number to a name. suppose that your company hires a new employee and you encounter data for the employee before anyone has updated the lookup table. this results in an #n/a error. you learn more about the vlookup() function in chapter 12, “using powerful functions: logical, lookup, and database functions,” of this book.
you might want the table to include something friendlier than the #n/a error. perhaps you would like the text new rep to appear instead of the unfriendly #n/a. there was not an easy way to do this in legacy versions of excel. you had to use a formula such as this:

=if(isna(vlookup(a2,aa1:ab99,2,false)),"new rep",vlookup(a2,aa1:ab99,2,false))

this formula forces excel to use the vlookup() function once, determine if it is an #n/a error, and then calculate the vlookup() function again. this is a pain for anyone using excel. if you have to change the vlookup(), you have to change it twice in the formula. also, it takes excel longer to calculate because it potentially has to do each good vlookup() twice.

microsoft corrected this problem in excel 2007 by adding the iferror() function. with the iferror() function, you can use the vlookup() function once and then provide an alternative value or formula in case the vlookup() function returns an error.

syntax: iferror(value,value_if_error)

this function returns a value you specify if a formula evaluates to an error; otherwise, it returns the result of the formula. you use the iferror() function to trap and handle errors in a formula. instead of the previous formula, you can use the following:

=iferror(vlookup(a2,aa1:ab99,2,false),"new rep")

using conditional formulas with multiple conditions: sumifs(), averageifs(), and countifs()

when users see how easy it is to use sumif(), they invariably want the function to do more. one of the most frequent questions at the mrexcel message board is along the lines of, “i am using sumif() to get a total by region. how can i put two conditions in there to only get the total for a certain region and product?”

in legacy versions of excel, there were ways to do this, but they were difficult. you had to use either sumproduct() or an array formula.
there is a lot of complexity in going from a simple sumif() to the complex boolean logic required to understand sumproduct().

thankfully, beginning with excel 2007, microsoft implemented versions of sumif(), countif(), and averageif() that can handle not just two conditions, but up to 127 conditions. three of these newer functions add the letter s to the end of the function name (sumifs(), countifs(), and averageifs()) to signify that multiple ifs are being considered.

with sumifs() and averageifs(), you first specify the range to be summed or averaged. you then specify pairs of arguments. in each pair, you first specify the range to check and then the value to match in that range.

syntax: sumifs(sum_range,criteria_range1,criteria1[,criteria_range2,criteria2...])

syntax: countifs(criteria_range1,criteria1[,criteria_range2,criteria2...])

syntax: averageifs(average_range,criteria_range1,criteria1[,criteria_range2,criteria2...])

caution be careful because the syntax for sumifs is in a different sequence than the sumif function. in sumifs, the range to be summed comes first; in sumif, it comes last.

functions with new variations in excel 2010

several functions have new variations in excel 2010. these are situations where microsoft had been calculating the function in one way, whereas industry best practice had evolved to show that the function should be calculated a new way.

calculating multiple mode values

remember back when you took a stats class? the median is the value in the center of an ordered data set. the mean is the average of the data set. the mode is the value that occurs most often in a data set.

but what happens if two numbers are in a tie for the mode? for years, excel returned only the first value when you used the mode function. the old mode function is still in excel with a new name of mode.sngl. excel 2010 offers a new array function called mode.mult. select a vertical range of cells. type =mode.mult(d1:d12).
hold down ctrl+shift while pressing enter. excel returns all of the modes in the data set, followed by #n/a values.

calculating workdays

legacy versions of excel offered networkdays and workday functions. these functions assumed a workweek of monday through friday. this works fine for the united states, but other countries have workweeks that typically run on other days of the week. excel 2010 introduces networkdays.intl and workday.intl to handle other variations of 5-day or 6-day workweeks. a new weekend value is the third argument in the function; 1 represents a traditional weekend of saturday and sunday; 7 represents a weekend of friday and saturday. double-digit values represent one-day weekends, with 11 being sunday only through 17 being saturday only.

handling ties in the rank function

suppose you have 10 golfers with their scores. if you have two golfers tied for third place, the old rank function would assign both of those golfers a rank of 3 and no golfer a rank of 4. the rank.eq function is the new name for the old rank function. it continues to calculate as before. excel 2010 introduces a new rank.avg function. in this variant, both golfers tied for third would be assigned a rank of 3.5, because this is the average of the 3 and 4 positions. although rank.avg will make statisticians and scientists happy, it will disappoint excel gurus everywhere who need one person to be ranked 3 and another person to be ranked 4.

calculating percentiles and quartiles

statisticians disagree with excel's method for calculating percentiles and quartiles. excel takes the min and max values and interpolates the quartile and percentile. the .inc versions of quartile, percentile, and percentrank continue to calculate using the interpolation method as in past versions of excel. the new .exc versions of those functions use the algorithm #6 proposed by weibull and gumbel in hyndman and fan.
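the tie handling and the two percentile conventions described above can be sketched in python. this is a minimal sketch under stated assumptions, not excel's implementation: the function names are hypothetical, the inclusive method is assumed to interpolate at position k*(n-1) counted from zero, and the exclusive method at position k*(n+1) counted from one.

```python
def rank_eq(values, x, ascending=False):
    """rank of x; tied values all receive the best rank of the tied group."""
    if ascending:
        better = sum(1 for v in values if v < x)
    else:
        better = sum(1 for v in values if v > x)
    return 1 + better


def rank_avg(values, x, ascending=False):
    """rank of x; tied values receive the average of the positions they occupy."""
    ties = sum(1 for v in values if v == x)
    return rank_eq(values, x, ascending) + (ties - 1) / 2


def _interpolate(data, pos):
    """linear interpolation in sorted data at a zero-based fractional position."""
    lower = int(pos)
    frac = pos - lower
    if frac == 0:
        return data[lower]
    return data[lower] + frac * (data[lower + 1] - data[lower])


def percentile_inc(values, k):
    """inclusive method: the min is the 0th percentile and the max is the 100th."""
    data = sorted(values)
    return _interpolate(data, k * (len(data) - 1))


def percentile_exc(values, k):
    """exclusive method: only defined for 1/(n+1) <= k <= n/(n+1)."""
    data = sorted(values)
    n = len(data)
    pos = k * (n + 1)  # one-based position
    if pos < 1 or pos > n:
        raise ValueError("k is outside the range the exclusive method supports")
    return _interpolate(data, pos - 1)


golf_scores = [68, 70, 71, 71, 73]  # hypothetical scores; lower is better
print(rank_eq(golf_scores, 71, ascending=True))
print(rank_avg(golf_scores, 71, ascending=True))
print(percentile_inc([1, 2, 3, 4], 0.25), percentile_exc([1, 2, 3, 4], 0.25))
```

with only four data points the two percentile methods already disagree on the first quartile, which is the kind of difference the .inc and .exc suffixes are meant to make explicit.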
calculating ceiling and floor for negative values

in legacy versions of excel, ceiling(5.1,1) would round up to 6. no one disagrees that this is the right answer. the problem is that ceiling(-5.1,-1) would round to -6. mathematicians argued with that answer. the ceiling function should return a value that is larger than the original value. they argued that if you move up from -5.1, you should end up at -5. microsoft added ceiling.precise to always round up toward positive infinity and floor.precise to always round down toward negative infinity.

functions that have been renamed

the statistical functions had been poorly named. depending on the distribution that you are using, the naming conventions varied. for the chi squared distribution, the inverse function was called chiinv. for the binomial distribution, the inverse function was called critbinom.

the existing statistical functions have been renamed. watch for these suffixes:

• .dist is used for the probability density function and for the left-tailed cumulative distribution function. an argument explains if the function is the pdf or the cdf.
• .inv is used for the inverse cumulative distribution function.
• .rt is used for right-tailed.
• .lt is used for left-tailed.
• .test is used for hypothesis testing functions.
• .p is used for functions based on a population.
• .s is used for functions based on a sample.

for the distribution functions, the new names are shown in table 10.1.
table 10.1 new distribution function names

distribution            pdf/cdf        right-tailed cdf  inverse left-tailed cdf  inverse right-tailed cdf
beta                    beta.dist      -                 beta.inv                 -
binomial                binom.dist     -                 binom.inv                -
chi-squared             chisq.dist     chisq.dist.rt     chisq.inv                chisq.inv.rt
exponential             expon.dist     -                 -                        -
f                       f.dist         f.dist.rt         f.inv                    f.inv.rt
gamma                   gamma.dist     -                 gamma.inv                -
hypergeometric          hypgeom.dist   -                 -                        -
lognormal               lognorm.dist   -                 lognorm.inv              -
negative binomial       negbinom.dist  -                 -                        -
normal                  norm.dist      -                 norm.inv                 -
standard normal         norm.s.dist    -                 norm.s.inv               -
poisson                 poisson.dist   -                 -                        -
student's t             t.dist         t.dist.rt         t.inv                    -
student's t (2 tailed)  t.dist.2t      -                 t.inv.2t                 -
weibull                 weibull.dist   -                 -                        -

for hypothesis testing, the functions are f.test, t.test, and z.test. confidence tests are confidence.norm for the normal distribution and confidence.t for the student's t distribution.

variance and standard deviation have always been available as functions for a sample (var, stdev) and a population (varp and stdevp). microsoft renamed these to be var.s, stdev.s, var.p, and stdev.p. microsoft also formalized the fact that the old covar function is based on a population by renaming it to covariance.p, and it added a sample version named covariance.s.

using worksheets with legacy function names

with the new naming scheme, many functions are in excel 2010 twice. the new var.s function will work, but microsoft has to support the old var function because millions of legacy spreadsheets exist that have been using var.

further, if you are creating a new workbook in excel 2010 and that workbook will be shared with people who are using excel 2007 or earlier, you pretty much have to keep using the old naming convention.

still, microsoft is hopeful that people will start using the new naming scheme. in the statistical function drop-down, the new names appear. the old names have been moved to the compatibility drop-down.
as you start to type a function name, the formula autocomplete always lists the new names first. the old names are listed at the end with a symbol to indicate that they are there for compatibility only (see figure 10.14).

figure 10.14 the old functions are listed at the end of the autocomplete list with an exclamation symbol.

table 10.2 compares the old and new function names. the older chi-squared, f, and t functions were right-tailed or two-tailed, so their direct replacements carry the .rt or .2t suffix.

table 10.2 comparison of old and new function names

old           new
betadist      beta.dist
betainv       beta.inv
binomdist     binom.dist
chidist       chisq.dist.rt
chiinv        chisq.inv.rt
chitest       chisq.test
confidence    confidence.norm
covar         covariance.p
critbinom     binom.inv
expondist     expon.dist
fdist         f.dist.rt
finv          f.inv.rt
ftest         f.test
gammadist     gamma.dist
gammainv      gamma.inv
hypgeomdist   hypgeom.dist
loginv        lognorm.inv
lognormdist   lognorm.dist
mode          mode.sngl
negbinomdist  negbinom.dist
normdist      norm.dist
norminv       norm.inv
normsdist     norm.s.dist
normsinv      norm.s.inv
percentile    percentile.inc
percentrank   percentrank.inc
poisson       poisson.dist
quartile      quartile.inc
rank          rank.eq
stdev         stdev.s
stdevp        stdev.p
tdist         t.dist.2t or t.dist.rt
tinv          t.inv.2t
ttest         t.test
var           var.s
varp          var.p
weibull       weibull.dist
ztest         z.test

cube functions introduced in excel 2007

if you are creating pivot tables from olap data or from powerpivot, excel offers a convert to formulas command. excel uses a series of cube functions to retrieve the data cells from the data source. the following functions were introduced in excel 2007.

syntax: cubemember(connection,member_expression[,caption])

the cubemember() function returns a member or tuple in a cube hierarchy. you can use it to validate that the member or tuple exists in the cube.

syntax: cubememberproperty(connection,member_expression,property)

the cubememberproperty() function returns the value of a member property in the cube.
you can use it to validate that a member name exists within the cube and to return the specified property for that member.

syntax: cuberankedmember(connection,set_expression,rank[,caption])

the cuberankedmember() function returns the nth, or ranked, member in a set. you can use it to return one or more elements in a set, such as the top sales performer or top 10 students.

syntax: cubeset(connection,set_expression[,caption][,sort_order][,sort_by])

the cubeset() function defines a calculated set of members or tuples by sending a set expression to the cube on the server, which creates the set. the function then returns that set to excel. you can use cubeset() to build dynamic reports that aggregate and filter data, by using the return value as a slicer in the cubevalue() function, the cuberankedmember() function to choose specific members from the calculated set, and the cubesetcount() function to control the size of the set.

syntax: cubesetcount(set)

the cubesetcount() function returns the number of items in a set.

syntax: cubevalue(connection,member_expression1[,member_expression2...])

the cubevalue() function returns an aggregated value from a cube.

syntax: cubekpimember(connection,kpi_name,kpi_property[,caption])

the cubekpimember() function returns a key performance indicator (kpi) name, property, and measure, and it displays the name and property in the cell. a kpi is a quantifiable measurement, such as monthly gross profit or quarterly employee turnover, used to monitor an organization's performance.

using the former atp functions

in legacy versions of excel, a variety of functions were known as the atp functions. these functions were available only on computers that had enabled the analysis toolpak (atp) add-in. even if you had enabled the atp, there was a danger that you could send a workbook to a co-worker who had not enabled the atp.
if this happened, all the formulas that used any of the 89 atp functions would change to #name? errors.

note: to use the cubekpimember() function, you must have sql server analysis services 2005 or later.

although it was easy to enable the atp in legacy versions of excel, there was a great deal of paranoia about using these functions, sending the workbook out, and having others obtain the wrong results. for this reason, some companies instituted policies against using the atp functions. people would write elaborate formulas to duplicate what could easily be done with the atp.

in a smart move, microsoft has promoted the 89 atp functions to be part of the standard excel package starting with excel 2007. this means that you can safely share an excel workbook with any other person using excel 2007 or later, and all the functions will continue to work. however, be aware that if you are sharing a workbook with a person who is using a legacy version of excel, the functions in the atp may change to #name? errors on the other computer.

the individual functions are covered in the next five chapters of this book. however, the following alphabetical list is provided as a guide to which functions are potential problems when sharing with people using legacy versions of excel. see the "function reference chapters" section of this chapter to learn more about the functions covered in chapters 11 to 15 of this book.

table 10.3 lists the functions that used to be included in the atp but are now a default part of excel.
table 10.3 functions that are now a default part of excel

accrint()  accrintm()  amordegrc()  amorlinc()  besseli()  besselj()
besselk()  bessely()  bin2dec()  bin2hex()  bin2oct()  complex()
convert()  coupdaybs()  coupdays()  coupdaysnc()  coupncd()  coupnum()
couppcd()  cumipmt()  cumprinc()  dec2bin()  dec2hex()  dec2oct()
delta()  disc()  dollarde()  dollarfr()  duration()  edate()
effect()  eomonth()  erf()  erfc()  error.type()  factdouble()
fvschedule()  gcd()  gestep()  hex2bin()  hex2dec()  hex2oct()
imabs()  imaginary()  imargument()  imconjugate()  imcos()  imdiv()
imexp()  imln()  imlog10()  imlog2()  impower()  improduct()
imreal()  imsin()  imsqrt()  imsub()  imsum()  intrate()
iseven()  isodd()  lcm()  mduration()  mround()  multinomial()
networkdays()  nominal()  oct2bin()  oct2dec()  oddfprice()  oddfyield()
oddlprice()  oddlyield()  price()  pricedisc()  pricemat()  quotient()
randbetween()  received()  seriessum()  sql.request()  sqrtpi()  tbilleq()
tbillprice()  tbillyield()  weeknum()  workday()  xirr()  xnpv()
yearfrac()  yield()  yielddisc()  yieldmat()

function reference chapters

chapters 11 through 15 provide a fairly comprehensive reference for most of the 400 functions in excel 2010. at the beginning of each of these chapters is an alphabetical list of the functions described, along with arguments and a short description of each function. following the alphabetical list are examples of how to use the functions. these examples describe all the required arguments. the examples are designed to give you ideas of how to use the functions in real life.

function coverage is broken out as follows:

• chapter 11, "using everyday functions: math, date and time, and text functions," describes functions that many people encounter in their everyday life: some of the math functions, date functions, and text functions.
• chapter 12, "using powerful functions: logical, lookup, and database functions," describes functions that are a bit more difficult but that should be a part of your everyday arsenal. these include a series of functions for making decisions in a formula. they include the if function and are known collectively as the logical functions. chapter 12 also describes the information, lookup, and database functions.
• chapter 13, "using financial functions," describes the financial functions. the first section of the chapter includes functions that anyone can use to calculate a car loan or plan for retirement. the later sections of the chapter include functions for depreciation, business valuation, and bond investing.
• chapter 14, "using statistical functions," describes statistical functions. many of these functions are useful every day (for example, average(), max(), min(), rank()). the chapter also describes many highly specialized functions that are useful to scientists and engineers.
• chapter 15, "using trig, matrix, and engineering functions," describes trigonometry and engineering functions. the trigonometry functions are grouped along with the other math functions in the math functions icon, but they are described separately in this book because they are more specialized. the engineering functions are highly specialized.

11 using everyday functions: math, date and time, and text functions

excel offers many functions for dealing with basic math, dates and times, and text. this chapter describes the functions found under the date & time icon and the math & trig icon on the formulas tab.

a few of the new functions in excel 2010 fall into this chapter:

• aggregate: provides a way to ignore error values, other subtotals, and/or filtered rows in 19 other functions.
whereas subtotal provided a way to ignore rows hidden in a filter for 11 functions, the aggregate function adds median, mode, percentile, large, small, and quartile functions (percentile and quartile each in .inc and .exc variants). also, the ability to use an array gives you the possibility of extending the sumifs concept to min, max, median, and so on.
• ceiling.precise: corrects the way that microsoft implemented the ceiling function for negative numbers.
• networkdays.intl: extends the functionality of networkdays to companies in which the weekend is a pair of days other than saturday and sunday.
• workday.intl: extends the functionality of workday to calendars in which the weekend is a pair of days other than saturday and sunday.

table 11.1 provides an alphabetical list of all of excel 2010's math functions. detailed examples of these functions are provided later in this chapter.

table 11.1 alphabetical list of math functions

function: description
abs(number): returns the absolute value of a number. the absolute value of a number is the number without its sign.
aggregate(function,options,array,[k]): performs one of 19 functions with the ability to ignore error values, other subtotals, and/or rows hidden by a filter. new in excel 2010.
ceiling(number,significance): returns the number rounded up, away from zero, to the nearest multiple of significance. for example, if you want to avoid using pennies in your prices and your product is priced at 4.42, you can use the formula ceiling(4.42,0.05) to round prices up to the nearest nickel. note that excel calculates ceiling(-2.1,-1) as -3, which is different from the iso standard. see ceiling.precise for an alternative.
ceiling.precise(number,significance): rounds a number up to the nearest multiple of significance. new in excel 2010 to provide compatibility with the iso standard for computing the ceiling of a negative number.
combin(number,number_chosen): returns the number of combinations for a given number of items. you use combin to determine the total possible number of groups for a given number of items.
countif(range,criteria): counts the number of cells within a range that meet the given criteria.
even(number): returns number rounded up to the nearest even integer. you can use this function for processing items that come in twos. for example, suppose a packing crate accepts rows of one or two items. the crate is full when the number of items, rounded up to the nearest two, matches the crate's capacity.
exp(number): returns e raised to the power of number. the constant e equals 2.71828182845904, the base of the natural logarithm.
fact(number): returns the factorial of a number. the factorial of a number is equal to 1*2*3*...*number.
factdouble(number): returns the double factorial of a number.
floor(number,significance): rounds the number down, toward zero, to the nearest multiple of significance.
gcd(number1,number2,...): returns the greatest common divisor of two or more integers. the greatest common divisor is the largest integer that divides both number1 and number2 without a remainder.
int(number): rounds a number down to the nearest integer.
lcm(number1,number2,...): returns the least common multiple of integers. the least common multiple is the smallest positive integer that is a multiple of all integer arguments number1, number2, and so on. you use lcm to add fractions that have different denominators.
mod(number,divisor): returns the remainder after number is divided by divisor. the result has the same sign as divisor.
mround(number,multiple): returns a number rounded to the desired multiple.
multinomial(number1,number2,...): returns the ratio of the factorial of a sum of values to the product of factorials.
odd(number): returns number rounded up to the nearest odd integer.
pi(): returns the number 3.14159265358979, the mathematical constant pi, accurate to 15 digits.
power(number,power): returns the result of a number raised to a power.
product(number1,number2,...): multiplies all the numbers given as arguments and returns the product.
quotient(numerator,denominator): returns the integer portion of a division operation. you use this function when you want to discard the remainder of a division.
rand(): returns an evenly distributed random number greater than or equal to 0 and less than 1. a new random number is returned every time the worksheet is calculated.
randbetween(bottom,top): returns a random number between the numbers specified. a new random number is returned every time the worksheet is calculated.
roman(number,form): converts an arabic numeral to roman, as text.
round(number,num_digits): rounds a number to a specified number of digits.
rounddown(number,num_digits): rounds a number down, toward zero.
roundup(number,num_digits): rounds a number up, away from zero.
sign(number): determines the sign of a number. returns 1 if the number is positive, 0 if the number is 0, and -1 if the number is negative.
sqrt(number): returns a positive square root.
sqrtpi(number): returns the square root of (number * pi).
subtotal(function_num,ref1,ref2,...): returns a subtotal in a list or database. it is generally easier to create a list with subtotals by using the subtotals command (from the data menu). after the subtotal list is created, you can modify it by editing the subtotal function.
sum(number1,number2,...): adds all the numbers in a range of cells.
sumif(range,criteria,sum_range): adds the cells specified by the given criteria.
sumproduct(array1,array2,array3,...): multiplies corresponding components in the given arrays and returns the sum of those products.
trunc(number,num_digits): truncates a number to an integer by removing the fractional part of the number.

table 11.2 provides an alphabetical list of all of excel 2010's date and time functions. detailed examples of these functions are provided later in this chapter.

table 11.2 alphabetical list of date and time functions

function: description
date(year,month,day): returns the serial number that represents a particular date.
datedif(start_date,end_date,unit): calculates the number of days, months, or years between two dates. this function is provided for compatibility with lotus 1-2-3.
datevalue(date_text): returns the serial number of the date represented by date_text. you use datevalue to convert a date represented by text to a serial number.
day(serial_number): returns the day of a date, represented by a serial number. the day is given as an integer ranging from 1 to 31.
days360(start_date,end_date,method): returns the number of days between two dates, based on a 360-day year (that is, 12 30-day months), which is used in some accounting calculations. you use this function to help compute payments if your accounting system is based on 12 30-day months.
edate(start_date,months): returns the serial number that represents the date that is the indicated number of months before or after a specified date (that is, the start_date). you use edate to calculate maturity dates or due dates that fall on the same day of the month as the date of issue.
eomonth(start_date,months): returns the serial number for the last day of the month that is the indicated number of months before or after start_date. you use eomonth to calculate maturity dates or due dates that fall on the last day of the month.
hour(serial_number): returns the hour of a time value. the hour is given as an integer, ranging from 0 (12:00 a.m.) to 23 (11:00 p.m.).
minute(serial_number): returns the minutes of a time value. the minutes are given as an integer, ranging from 0 to 59.
month(serial_number): returns the month of a date represented by a serial number. the month is given as an integer, ranging from 1 (for january) to 12 (for december).
networkdays(start_date,end_date,holidays): returns the number of whole working days between start_date and end_date. working days exclude weekends and any dates identified in holidays. you use networkdays to calculate employee benefits that accrue based on the number of days worked during a specific term. weekends are defined as saturday and sunday. to handle other calendars, see networkdays.intl.
networkdays.intl(start_date,end_date,weekend,holidays): returns the number of whole working days between start_date and end_date. added in excel 2010 to support calendars in which the weekend is a pair of days other than saturday and sunday.
now(): returns the serial number of the current date and time.
second(serial_number): returns the seconds of a time value. the seconds are given as an integer in the range 0 to 59.
time(hour,minute,second): returns the decimal number for a particular time. the decimal number returned by time is a value ranging from 0 to 0.99999999, representing the times from 0:00:00 (12:00:00 a.m.) to 23:59:59 (11:59:59 p.m.).
timevalue(time_text): returns the decimal number of the time represented by a text string. the decimal number is a value ranging from 0 to 0.99999999, representing the times from 0:00:00 (12:00:00 a.m.) to 23:59:59 (11:59:59 p.m.).
today(): returns the serial number of the current date. the serial number is the date/time code that microsoft excel uses for date and time calculations.
weekday(serial_number,return_type): returns the day of the week corresponding to a date.
the day is given as an integer, ranging from 1 (for sunday) to 7 (for saturday), by default.
weeknum(serial_num,return_type): returns a number that indicates where the week falls numerically within a year.
workday(start_date,days,holidays): returns a number that represents a date that is the indicated number of working days before or after a date (the starting date). working days exclude weekends and any dates identified as holidays. you use workday to exclude weekends or holidays when you calculate invoice due dates, expected delivery times, or the number of days of work performed. to view the number as a date, format the cell as a date. weekends are defined as saturday and sunday. for alternative calendars, see workday.intl.
workday.intl(start_date,days,weekend,holidays): returns a number that represents a date that is the indicated number of working days before or after a starting date. added to excel 2010 to accommodate calendar systems where the weekend is a pair of days other than saturday and sunday.
year(serial_number): returns the year corresponding to a date. the year is returned as an integer in the range 1900 through 9999.
yearfrac(start_date,end_date,basis): calculates the fraction of the year represented by the number of whole days between two dates (start_date and end_date). you use the yearfrac worksheet function to identify the proportion of a whole year's benefits or obligations to assign to a specific term.

table 11.3 provides an alphabetical list of all of excel 2010's text functions. detailed examples of these functions are provided later in this chapter.

table 11.3 alphabetical list of text functions

function: description
asc(text): changes full-width (double-byte) english letters or katakana within a character string to half-width (single-byte) characters.
bahttext(number): converts a number to thai text and adds the suffix baht.
this function was new in excel xp.
char(number): returns the character specified by number. you use char to translate code page numbers you might get from files on other types of computers into characters.
clean(text): removes all nonprintable characters from text. you use clean on text imported from other applications that contains characters that may not print with your operating system. for example, you can use clean to remove some low-level computer code that is frequently at the beginning and end of data files and cannot be printed.
code(text): returns a numeric code for the first character in a text string. the returned code corresponds to the character set used by your computer.
concatenate(text1,text2,...): joins several text strings into one text string.
dollar(number,decimals): converts a number to text using currency format, with the decimals rounded to the specified place. the format used is $#,##0.00_);($#,##0.00).
exact(text1,text2): compares two text strings and returns true if they are the same, and false otherwise. exact is case-sensitive but ignores formatting differences. you use exact to test text being entered into a document.
find(find_text,within_text,start_num): finds one text string (find_text) within another text string (within_text) and returns the number of the starting position of find_text, from the first character of within_text. you can also use search to find one text string within another, but unlike search, find is case-sensitive and doesn't allow wildcard characters.
findb(find_text,within_text,start_num): finds one text string (find_text) within another text string (within_text) and returns the number of the starting position of find_text, based on the number of bytes each character uses, from the first character of within_text. you use findb with double-byte characters. you can also use searchb to find one text string within another.
fixed(number,decimals,no_commas): rounds a number to the specified number of decimals, formats the number in decimal format using a period and commas, and returns the result as text.
jis(text): changes half-width (single-byte) english letters or katakana within a character string to full-width (double-byte) characters.
left(text,num_chars): returns the first character or characters in a text string, based on the number of characters specified.
leftb(text,num_bytes): returns the first character or characters in a text string, based on the number of bytes specified. you use leftb with double-byte characters.
len(text): returns the number of characters in a text string.
lenb(text): returns the number of bytes used to represent the characters in a text string. you use lenb with double-byte characters.
lower(text): converts all uppercase letters in a text string to lowercase.
mid(text,start_num,num_chars): returns a specific number of characters from a text string, starting at the position specified, based on the number of characters specified.
midb(text,start_num,num_bytes): returns a specific number of characters from a text string, starting at the position specified, based on the number of bytes specified. you use midb with double-byte characters.
phonetic(reference): extracts the phonetic (furigana) characters from a text string. furigana are a japanese reading aid. they consist of smaller kana printed next to a kanji to indicate its pronunciation.
proper(text): capitalizes the first letter in a text string and any other letters in text that follow any character other than a letter. converts all other letters to lowercase.
replace(old_text,start_num,num_chars,new_text): replaces part of a text string, based on the number of characters specified, with a different text string.
replaceb(old_text,start_num,num_bytes,new_text): replaces part of a text string, based on the number of bytes specified, with a different text string. you use replaceb with double-byte characters.
rept(text,number_times): repeats text a given number of times. you use rept to fill a cell with a number of instances of a text string.
right(text,num_chars): returns the last character or characters in a text string, based on the number of characters specified.
rightb(text,num_bytes): returns the last character or characters in a text string, based on the number of bytes specified. you use rightb with double-byte characters.
search(find_text,within_text,start_num): returns the number of the character at which a specific character or text string is first found, beginning with start_num. you use search to determine the location of a character or text string within another text string so that you can use the mid or replace functions to change the text.
searchb(find_text,within_text,start_num): finds one text string (find_text) within another text string (within_text) and returns the number of the starting position of find_text. the result is based on the number of bytes each character uses, beginning with start_num. you use searchb with double-byte characters. you can also use findb to find one text string within another.
substitute(text,old_text,new_text,instance_num): substitutes new_text for old_text in a text string. you use substitute when you want to replace specific text in a text string; you use replace when you want to replace any text that occurs in a specific location in a text string.
t(value): returns the text referred to by value.
text(value,format_text): converts a value to text in a specific number format.
trim(text): removes all spaces from text except for single spaces between words. you use trim on text that you have received from another application that may have irregular spacing.
upper(text): converts text to uppercase.
value(text): converts a text string that represents a number to a number.
yen(number,decimals): converts a number to text, using the japanese yen currency format, with the number rounded to a specified place.

examples of math functions

the most common formula in excel is a formula to add a column of numbers. in addition to sum, excel offers a variety of mathematical functions.

using sum to add numbers

the sum function is by far the most commonly used function in excel. this function can add numbers from one or more ranges of data.

syntax: sum(number1,number2,...)

the sum function adds all the numbers in a range of cells. the arguments number1, number2,... are 1 to 255 arguments for which you want the total value or sum.

a typical use of this function is sum(b4:b12). it is also possible to use sum(1,2,3). in the latter form, you cannot specify more than 255 individual values. in the former, you can specify up to 255 ranges, each of which can include thousands of cells. in figure 11.1, cell b25 contains a formula to sum three individual cells: sum(b17,b19,b23).

figure 11.1: a variety of sum formulas.

it is unlikely that you will need more than 255 arguments in this function, but if you do, you can group arguments in parentheses. for example, sum((a10,a12),(a14,a16)) would count as only 2 of the 255 allowed arguments.

if a text value that looks like a number is included in a range, the text value is not included in the result of the sum. strangely enough, if you specify the text value directly as an argument in the function, excel does add it to the result. for example, sum(1,2,"3") will be 6, yet sum(d4:d6) in figure 11.1 will result in 3.

the comma is treated as a union operator. if you replace the comma with a space, excel will find the cells that fall in the intersection of the selected ranges.
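as a rough illustration of the union and intersection operators, the following python sketch models a range as a set of cell addresses. the helper names (col_to_num, num_to_col, cells) are hypothetical and are not part of excel; the sketch only shows which cells each operator selects, using the ranges f13:h14 and g12:g15.

```python
import re

def col_to_num(col):
    """Convert a column label such as 'A' or 'AB' to a 1-based number."""
    n = 0
    for ch in col.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

def num_to_col(n):
    """Convert a 1-based column number back to a column label."""
    label = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        label = chr(ord("A") + rem) + label
    return label

def cells(ref):
    """Expand a reference like 'F13:H14' (or a single cell) into a set of addresses."""
    parts = ref.upper().split(":")
    first = re.fullmatch(r"([A-Z]+)(\d+)", parts[0])
    last = re.fullmatch(r"([A-Z]+)(\d+)", parts[-1])
    c1, c2 = sorted((col_to_num(first.group(1)), col_to_num(last.group(1))))
    r1, r2 = sorted((int(first.group(2)), int(last.group(2))))
    return {f"{num_to_col(c)}{r}" for c in range(c1, c2 + 1) for r in range(r1, r2 + 1)}

# Comma (union): every cell that appears in any of the references.
union = cells("B17") | cells("B19") | cells("B23")

# Space (intersection): only the cells the two ranges share.
intersection = cells("F13:H14") & cells("G12:G15")

print(sorted(union))
print(sorted(intersection))
```

the intersection contains only the cells in column g that sit on rows 13 and 14, which is why the space operator adds just two cells.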
in cell d17, the formula sum(f13:h14 g12:g15) adds up the two cells that the two ranges have in common.

if one cell in a referenced range contains an error, the result of the sum function is an error. to add numbers while ignoring error cells, use the new aggregate function.

it is valid to create a spearing formula. this type of formula adds the identical cell from many worksheets. for example, sum(jan:dec!b20) would add cell b20 on all 12 sheets between january and december. if the sheet names contain spaces or other nonalphabetic characters, surround the sheet names with apostrophes: sum('jan 2011:dec 2011'!b20).

to quickly enter a sum formula, you can press alt+= or click the autosum icon on the formulas tab. in figure 11.2, clicking the autosum icon will add totals to the 13 selected blank cells all at once (see figure 11.3).

figure 11.2: the autosum icon (the greek letter sigma) adds sum formulas to all the selected cells at once.

figure 11.3: after clicking autosum, the total formulas are automatically entered.

using aggregate to ignore error cells or filtered rows

one of the best new functions in excel 2010 is the aggregate function. this one function lets you perform 19 functions on a range of data while selectively ignoring error cells and/or rows hidden by a filter.

syntax: aggregate(function_num, options, array, [k])
syntax: aggregate(function_num, options, ref1, ref2, ...)

the options argument is the interesting feature of the new function. you can choose to ignore any, all, or none of these categories:
• error values
• hidden rows
• other subtotal and aggregate functions

the capability to ignore filtered rows and other aggregate functions is similar to the subtotal function. the capability of aggregate to ignore error values as well solves a common problem in the sum, count, and other functions.
usually, a single #n/a error cell in a range will cause most functions to return an #n/a error. the options in aggregate allow you to ignore any error cells in the range.

the options argument controls which values are ignored. this is a simple binary system, as follows:
• to ignore other subtotals, add zero. to include subtotals, add 4.
• to ignore hidden rows, add 1.
• to ignore error values, add 2.
• thus, to ignore other subtotals, hidden rows, and error values, you would specify 3 (0+1+2) as the options argument.
• to ignore error values but include other subtotal values, you would specify 6 (2+4) as the argument.

this calculation works out as shown in table 11.4.

table 11.4: arguments for aggregate function
option | what is ignored
0 | ignore other subtotals
1 | ignore hidden rows & subtotals
2 | ignore error cells & subtotals
3 | ignore all three
4 | ignore nothing
5 | ignore hidden rows
6 | ignore error cells
7 | ignore hidden rows & error cells

in figure 11.4, the #n/a error in cell f13 causes the sum function in f18 to also return an #n/a. if you use 2, 3, 6, or 7 as the second argument of aggregate, you can easily sum all the other numbers, as in cell f1. you can also use other function numbers to calculate min, max, count, median, mode, percentile, and quartile values. to see a demo of using aggregate, search for "excel in depth 11" at youtube.

figure 11.4: using a 2 or 3 as the options argument for aggregate will allow the function to ignore error cells in a range.

the function can also be used to ignore cells hidden by a filter. whereas the old subtotal function would allow you to do this for eleven calculation functions, the aggregate function adds eight new functions to the list. table 11.5 shows the 19 functions available in the aggregate function.
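the binary scheme behind the options argument can be sketched in python. this is a minimal sketch that follows the rows of table 11.4; the names aggregateoptions and decode_options are hypothetical and exist only for illustration.

```python
from typing import NamedTuple

class AggregateOptions(NamedTuple):
    ignore_hidden_rows: bool
    ignore_errors: bool
    ignore_nested_subtotals: bool

def decode_options(options):
    """Decode an AGGREGATE options value (0-7) following table 11.4."""
    if not 0 <= options <= 7:
        raise ValueError("options must be between 0 and 7")
    return AggregateOptions(
        ignore_hidden_rows=bool(options & 1),   # add 1 to ignore hidden rows
        ignore_errors=bool(options & 2),        # add 2 to ignore error values
        # Adding 4 means nested SUBTOTAL/AGGREGATE results are included,
        # so they are ignored only when that bit is NOT set.
        ignore_nested_subtotals=not (options & 4),
    )

for value in range(8):
    print(value, decode_options(value))
```

reading the printed rows against table 11.4 shows that the four options that ignore error cells are the ones with the 2 bit set.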
the list in table 11.5 mirrors the 11 functions available in subtotal (arranged alphabetically to match those in the subtotal function), followed by eight new functions arranged in order of popularity.

table 11.5: functions available in aggregate
fx # | function
1 | average
2 | count
3 | counta
4 | max
5 | min
6 | product
7 | stdev.s
8 | stdev.p
9 | sum
10 | var.s
11 | var.p
12 | median
13 | mode.sngl
14 | large
15 | small
16 | percentile.inc
17 | quartile.inc
18 | percentile.exc
19 | quartile.exc

the last six functions in this list require you to specify a value for "k" as the fourth argument. large and small will typically return the kth largest or smallest value from a list. use the fourth argument in aggregate to specify the value for k. in cell f3 of figure 11.4, the final argument of 3 specifies that you want the third smallest number in the array. for large, small, and quartile, you should specify an integer for k. for percentile, specify a decimal between 0 and 1.

when you are trying to return results from the visible rows of a filtered data set, you can use either subtotal or aggregate. in figure 11.5, the sum function in d1 returns the sum of the visible and hidden rows. the subtotal function in d2 returns the sum of the visible rows, the same as the aggregate function in d3. the advantage of aggregate is that it can return median, large, small, percentile, and quartile on the visible rows as well.

using count or counta to count numbers or nonblank cells

a number of functions process nonblank cells. count counts all the numeric or date cells in a range. counta counts all the nonblank cells in a range.

syntax: count(value1,value2,...)

the count function counts the number of cells that contain numbers and also numbers within the list of arguments. you use count to get the number of numeric entries in a range or array.
tip: simulating largeifs, percentileifs, quartileifs

function numbers 14 through 19 allow the array to be calculated on-the-fly. the formula in b20 of figure 11.4 is a wild, over-the-top formula that seems like it would have come from the excel gurus gone wild book that i compiled in 2008. the goal is to find records that match multiple criteria and then to apply the large function to those matching records.

the array argument starts out with the sales amounts in d9:d16. but then the sales amounts are divided by a boolean expression. (b9:b16="east") checks to see if the record is in the east region. (c9:c16="r1") checks to see if the rep is r1. when you multiply these two conditions together, you will get an array of 1s and 0s. the 1 indicates both conditions are true. in the figure, the result would be {0;1;0;0;0;0;1;0}.

when the formula evaluates the sales amounts divided by the array of 1s and 0s, you either end up with the sales amount or a division by zero error. in the figure, the result of 16000 divided by 0 is #div/0!. the result of 1000 divided by 1 is 1000, and so on. the array contains mostly #div/0! errors and a few actual numbers. because the aggregate function has an option to ignore error values, the result is that the function simulates doing large with multiple conditions.

figure 11.5: aggregate will perform calculations on the visible items of a filtered data set.

caution: count and counta are found in the statistical drop-down under the more functions icon of the formulas tab.

the arguments value1, value2,... are 1 to 255 arguments that can contain or refer to a variety of types of data, but only numbers are counted.

note that whereas a single error cell in a range causes the sum function to return an error, the same condition is ignored in the count function.

count(1,2,"3") returns 3, because a text entry typed directly as an argument is counted.
if you refer to a range that contains text that looks like a number, the text is not included in the count.

syntax: counta(value1,value2,...)

counta counts the number of cells that are not empty and the values within the list of arguments. you use counta to count the number of cells that contain data in a range or an array.

the arguments value1, value2,... are 1 to 255 arguments representing the values you want to count. in this case, a value is any type of information, including empty text ("") but not including empty cells. if an argument is an array or a reference, empty cells within the array or reference are ignored. if you do not need to count logical values, text, or error values, you should use the count function.

note that error cells are included in the results from counta.

choosing between count and counta

the key to choosing between count and counta is to analyze the data that you want to count. in figure 11.6, someone has used the letter x in column b to indicate that training has been started. in this case, you would use counta to get an accurate count. column c contains dates (which are treated as numeric). in column c, either count or counta returns the correct result. column d has a mix of text and numeric entries. if you want to count how many people took the test, use counta. if you want to count how many people received a numeric score, use count.

using round, rounddown, roundup, int, trunc, floor, floor.precise, ceiling, ceiling.precise, even, odd, or mround to remove decimals or round numbers

a variety of functions, including round, rounddown, roundup, int, trunc, floor, floor.precise, ceiling, ceiling.precise, even, odd, and mround, can be used to round a result or to remove decimals from a result.
syntax: trunc(number), int(number), even(number), and odd(number)

caution: using more than 30 arguments in count or counta causes backward compatibility problems with legacy versions of excel.

the trunc, int, even, and odd functions always change a number to an integer. the syntax in each case is similar: the function accepts a single number or a single cell containing a number.

to remove the decimals from a result, use the trunc function. this truncates a number to the integer portion of the number. for example, trunc(1.9) is 1, and trunc(-1.9) is -1.

to remove the decimals from a result and always round down to the next lowest integer, use int. for positive numbers, trunc and int return identical values. a subtle difference exists between trunc and int: when you have a negative number, int rounds away from zero to produce the next lowest integer. thus, int(-1.1) is -2.

even rounds a number away from zero to the next even integer. for example, even(3) is 4, and even(-3) is -4. if the number is already an even integer, no adjustment is made; for example, even(6) is 6. this function is ideal for ordering products packed two to a case.

odd rounds a number away from zero to the next odd integer. for example, odd(1.1) is 3, and odd(-3.1) is -5. if the number is already an odd integer, no adjustment is made.

figure 11.7 compares the results of trunc, int, even, and odd.

figure 11.6: whether you use count or counta depends on whether your data is numeric. count counts only dates and numeric entries. counta counts anything that is nonblank.

syntax: round(number,num_digits), roundup(number,num_digits), and rounddown(number,num_digits)

three more functions (round, roundup, and rounddown) round a number to a specified number of decimal places. they all take the following arguments:
• number: this is the number you want to round.
• num_digits: this specifies the number of digits to which you want to round number.

with round, if the number of digits is zero, the number is rounded to the nearest integer, following these rules:
• fractional parts below 0.5 are rounded toward zero. for example, round(1.49999,0) results in 1, and round(-1.49999,0) results in -1.
• fractional parts of 0.5 and above are rounded away from zero. for example, round(1.5,0) results in 2, and round(-1.5,0) results in -2.

if num_digits is positive, the number is rounded to have the specified number of decimal places. if the number of digits is negative, the number is rounded to the left of the decimal point. for example, round(117,-1) is rounded to the nearest 10, or a value of 120.

to override the rounding rules, you can use rounddown or roundup:
• the rounddown function always rounds toward zero. for example, rounddown(1.999,0) rounds to 1, and rounddown(-19.999,0) rounds to -19. you might use this function when judging a contest in which an entrant who does not completely finish a task does not get credit for the unfinished portion of the task.
• the roundup function always rounds away from zero. for example, roundup(1.01,0) rounds up to 2, and roundup(-1.01,0) rounds to -2. you might use this function when calculating prices because if the customer uses any fractional portion of a product, he or she is charged for the complete product.

note: using a negative number for the number of digits provides an interesting result. if you need to round a number to the nearest thousand, you can indicate that it should be rounded to -3 decimal places. for example, round(1234567,-3) would be 1,235,000.

figure 11.7: trunc and int are nearly identical, except when the numbers become negative.

figure 11.8 compares round, roundup, and rounddown.

figure 11.8: these three functions always round to a power of 10.
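the rounding rules for round, roundup, and rounddown can be sketched with python's decimal module. this is only a model of the behavior described above, not excel's implementation; the function names excel_round, excel_roundup, and excel_rounddown are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP, ROUND_UP

def _round_with(number, num_digits, mode):
    # Decimal(1).scaleb(-2) is 0.01 (two decimal places);
    # Decimal(1).scaleb(3) is 1E+3 (round to the nearest thousand).
    step = Decimal(1).scaleb(-num_digits)
    return Decimal(str(number)).quantize(step, rounding=mode)

def excel_round(number, num_digits=0):
    """Model of ROUND: ties (0.5) go away from zero."""
    return _round_with(number, num_digits, ROUND_HALF_UP)

def excel_roundup(number, num_digits=0):
    """Model of ROUNDUP: always away from zero."""
    return _round_with(number, num_digits, ROUND_UP)

def excel_rounddown(number, num_digits=0):
    """Model of ROUNDDOWN: always toward zero."""
    return _round_with(number, num_digits, ROUND_DOWN)

# The ":f" format prints plain digits instead of scientific notation.
for label, result in [
    ("round(1.5,0)", excel_round(1.5)),
    ("round(-1.5,0)", excel_round(-1.5)),
    ("round(117,-1)", excel_round(117, -1)),
    ("roundup(1.01,0)", excel_roundup(1.01)),
    ("rounddown(-19.999,0)", excel_rounddown(-19.999)),
]:
    print(label, "->", f"{result:f}")
```

the sketch uses decimal rather than floats so that a value such as 1.5 is treated as exactly one and a half when the tie rule is applied.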
caution: if you remember how they taught you to round in school, you know the rule that numbers ending in 0.5 should always round up. the excel developers must have sat through the same curriculum as you and i did, because they implemented rounding in this manner. however, if you have a large number of data points that end in 0.5, you will introduce a fair amount of error in the data by using the method that we learned in school. in figure 11.9, a million data points end with a single decimal place. comparing the total of the points and the total of the round of the data points shows a delta of 9 hundredths of a percent. in this example, the rounded values total to 52,077 more than the original values.

a set of rules known as banker's rounding or astm e29 rounding prescribes that values ending in 0.5 should always be rounded to the nearest even integer. thus, 1.5 would round up to 2 and 2.5 would round down to 2. column d of figure 11.9 contrasts the banker's rounding method with the regular rounding method. this formula produces a result that is 91 times more accurate for this data set. the rounded values in column d are within 9.9 ten-thousandths of a percent, for a total error of only 572 over the million rows of data.

syntax: mround(number,multiple), ceiling(number,significance), ceiling.precise(number,[significance]), floor(number,significance), and floor.precise(number,[significance])

the last five functions in this group (mround, ceiling, ceiling.precise, floor, and floor.precise) round a number to a certain multiple. they require you to enter the number and the multiple to which to round. they all take the following arguments:
• number: this is the number you want to round.
• multiple or significance: this is the nearest multiple that you want to round toward. note that if number is negative, multiple or significance must also be negative.
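the caution about banker's rounding can be illustrated with a tiny python sketch. the four values below are hypothetical (they are not the million-row data set of figure 11.9); they are chosen so that every value ends in 0.5, which is the worst case for the school rounding rule.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Hypothetical data: every value ends in .5.
values = [Decimal("0.5"), Decimal("1.5"), Decimal("2.5"), Decimal("3.5")]

def total_rounded(data, mode):
    """Round each value to an integer with the given tie rule, then total."""
    return sum(v.quantize(Decimal(1), rounding=mode) for v in data)

true_total = sum(values)
school_total = total_rounded(values, ROUND_HALF_UP)     # ties away from zero
bankers_total = total_rounded(values, ROUND_HALF_EVEN)  # ties to the even integer

print("true total:   ", true_total)
print("school rule:  ", school_total)
print("banker's rule:", bankers_total)
```

with the school rule every tie is pushed upward, so the rounded total drifts above the true total; with ties going to the even integer, the upward and downward adjustments cancel for this data.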
suppose that you handle pricing for a line of products. your general rule is to mark up the product cost, which results in a series of strange prices, such as 185.9375, as shown in figure 11.10. to round each price to the nearest increment of 5, you would use mround(c2,5). you could also use mround to round to the nearest quarter: mround(c2,0.25). the multiple argument in mround is allowed to be negative.

figure 11.9: round will skew your data.

figure 11.10: mround rounds a price to a certain multiple. here, column d is the calculated prices rounded to the nearest 5.

in other situations, you may want to round a number up to a certain multiple. figure 11.11 shows a requisition list. column a shows the quantity needed, and column b shows the item. the purchasing agent discovered a vendor who offers a significant discount, but only if you buy in complete case quantities. column c shows the size of the case for each product. to calculate the total number to order, you need to round a number in column a up to the nearest multiple of the case size found in column c. you use ceiling(a4,c4) to achieve this effect.

ceiling rounds away from zero. if you use ceiling(-9,-6), the function rounds -9 to -12.

the iso standard for calculating ceiling disagrees with excel's calculation of ceiling for negative numbers. in excel, ceiling(-2.5,-1) would round the -2.5 to a lower number of -3. the iso standard says that ceiling should always round up. if you are at -2.5, the next higher value is actually -2. excel 2010 introduces the new ceiling.precise function. here are the differences between ceiling and ceiling.precise:
• the significance argument is optional in ceiling.precise. when omitted, the significance is assumed to be 1.
• ceiling.precise(-2.5) now rounds up to -2 instead of -3.
• in excel 2007, the number and significance had to have the same sign. previously, ceiling(-2,1) or ceiling(2,-1) would have evaluated to a #num! error.
in excel 2010, either ceiling.precise(-2.5,1) or ceiling.precise(2.5,-1) calculates without a problem. excel 2010 still calculates ceiling(2.5,-1) as a #num! error. however, strangely, ceiling(-2.5,1) no longer returns #num!. excel 2010 now returns -2, which is more like the ceiling.precise result.

figure 11.12 contrasts ceiling and ceiling.precise.

the floor function rounds a number to the next lowest multiple. suppose that you employ several student workers who do piecework. they assemble products and then pack them six to a case. your contract with the workers states that you pay only for complete cases. column b in figure 11.13 shows the total number of units assembled. you use floor(b6,6) to round this quantity down to the nearest multiple of six. note that if the value is already a multiple of six, as in cell b10, floor does not change the number.

caution: you should watch for one strange behavior with mround, floor, and ceiling. if the number is negative, you must ensure that the second argument for the function is also negative. certainly, in some situations you won't know in advance whether your numbers will be negative. if you think that your numbers might be a mix of positive and negative values, you should use mround(c2,5*sign(c2)). this ensures that the second parameter matches the sign of the first parameter.

figure 11.11: ceiling rounds a number up to the next multiple.

similar to the problems with ceiling, the scientific community does not agree that floor should round toward zero for negative numbers. in cell b14 of figure 11.13, the floor of -2.5 actually rounds up to -2 instead of down to -3. you can use floor.precise to match the iso standard. as shown in b15, the floor.precise of -2.5 is -3.

all the functions for rounding can be replaced with a clever combination of int and round functions.
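the sign behavior of ceiling, ceiling.precise, floor, and floor.precise described above can be summarized in a short python sketch. it models only the cases the text spells out, assumes a nonzero significance, and uses hypothetical function names; it is not a reimplementation of excel. because it works with binary floats, multiples such as 0.1 may not behave exactly as they do in a worksheet.

```python
import math

def ceiling_2010(number, significance):
    """CEILING as the text describes it for Excel 2010 (nonzero significance)."""
    if number > 0 and significance < 0:
        raise ValueError("#NUM!")  # positive number with negative significance
    if number < 0 and significance < 0:
        # Both negative: rounds away from zero, e.g. (-9, -6) gives -12.
        return -math.ceil(number / significance) * abs(significance)
    # Same-sign positives, or a negative number with positive significance.
    return math.ceil(number / significance) * significance

def ceiling_precise(number, significance=1):
    """CEILING.PRECISE: always rounds up; the sign of significance is ignored."""
    step = abs(significance)
    return math.ceil(number / step) * step

def floor_same_sign(number, significance):
    """FLOOR when number and significance share a sign: rounds toward zero."""
    step = abs(significance)
    return math.copysign(math.floor(abs(number) / step) * step, number)

def floor_precise(number, significance=1):
    """FLOOR.PRECISE: always rounds down; the sign of significance is ignored."""
    step = abs(significance)
    return math.floor(number / step) * step

print(ceiling_2010(-2.5, -1), ceiling_precise(-2.5))
print(floor_same_sign(-2.5, -1), floor_precise(-2.5))
```

the two printed pairs show the disagreement the text describes: the older functions treat a negative number by its magnitude, while the .precise versions always move up (ceiling) or down (floor) on the number line.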
if you receive a spreadsheet from an old-time lotus 1-2-3 user, you may see formulas like the ones in figure 11.14:
• cell b13 is equivalent to using mround with a multiple of 20. the formula divides 135 by 20, giving 6.75. round rounds this to 7. finally, outside the parentheses, the formula multiplies by 20 to arrive at the answer of 140.
• cell c13 is equivalent to using floor with a significance of 20. the formula divides 135 by 20, giving 6.75. the int removes the decimal places, leaving the integer 6. the formula then multiplies this result by 20 to arrive at 120.
• cell d13 is equivalent to using ceiling with a significance of 20. the formula divides 135 by 20, giving 6.75. next, the formula adds just less than 0.5 to make sure that any value greater than 6 is rounded up to 7. finally, the result is multiplied by 20 to arrive at 140.

figure 11.12: for negative numbers, iso.ceiling rounds toward zero.

figure 11.13: floor rounds a number down to the next multiple.

figure 11.14: a combination of round and int can replace any of the eight other functions used for rounding.

in legacy versions of excel, functions such as mround were not part of the core excel. they were enabled when someone installed the analysis toolpak. because new excel users might never have installed the analysis toolpak, some people avoided using mround and instead wrote the formulas as shown in figure 11.14. now that microsoft has elevated all the analysis toolpak functions to be part of the core excel 2010 product, it is safe to use those functions.

using subtotal instead of sum with multiple levels of totals

consider the data set shown in figure 11.15. this report shows a list of invoices for each customer. someone has manually inserted rows and used the sum function to total each customer. cells c5, c10, c15, and so on contain a sum function. it would be very difficult to enter a grand total at the bottom of this data set.
you might have to enter a long formula that points only at the summary rows. in this particular case, the formula to provide a grand total for 15 customers would be possible, as shown in figure 11.16. if you had 500 customers, however, the formula would be nearly impossible to enter.

many accountants can teach you the old accounting trick whereby you total the entire column and divide by two to get the grand total. this is based on the assumption that every dollar is in the column twice: once on the detail row and once on the summary row. as shown in figure 11.17, this trick does work, but it is hard to explain to your manager why it works.

the solution is to use the subtotal function. this powerful function is relatively new; it was introduced in excel 97.

see chapter 22, "using automatic subtotals," for more information about working with subtotals.

syntax: subtotal(function_num,ref1,ref2,...)

figure 11.15: whoever manually summed these rows doesn't know about the subtotal command on the data tab.

figure 11.16: it is difficult to enter the grand total formula.

in its default use, subtotal works just like the sum function, except it throws out other instances of the subtotal function within the range being summed. the subtotal function takes the following arguments:
• function_num: this is a number from 1 to 11. the most common function number is the number 9, which (for no apparent logical reason) is used to sum. when microsoft introduced the subtotal function, it offered 11 options: average, count, counta, max, min, product, stdev, stdevp, sum, var, and varp. it just happens that sum is the ninth item in this list when these functions are arranged alphabetically in the english language, so 9 became the function number for sum.
although subtotal always ignores rows hidden as the result of a filter, it does not automatically ignore rows hidden with the hide command. to ignore rows that have been manually hidden, add 100 to the function num.
• ref1,ref2,...: these are up to 254 ranges or references that you want to subtotal.

unlike with sum, the references in a subtotal function cannot be 3d references.

any other nested subtotals in the range are ignored to prevent double counting.

the subtotal function always ignores rows hidden as the result of a filter. this makes the subtotal function great in combination with autofilter, as you'll see later in this chapter, in figure 11.19.

a feature added in excel 2002 is that you can add 100 to the function number to prevent excel from including rows hidden by using the hide command. note that this functionality works only with hidden rows. if you hide columns and attempt to subtotal in a horizontal fashion, the hidden columns are not ignored.

the arguments for subtotal are shown in table 11.6.

figure 11.17: the old accounting trick of adding an entire column and dividing by two works but is hard to explain.

tip: the best way to insert the subtotal function is to use the subtotals icon on the data tab; however, you can set up these functions manually.

table 11.6: function arguments for subtotal
function_num (includes hidden values) | function_num (ignores hidden values) | function
1 | 101 | average
2 | 102 | count
3 | 103 | counta
4 | 104 | max
5 | 105 | min
6 | 106 | product
7 | 107 | stdev
8 | 108 | stdevp
9 | 109 | sum
10 | 110 | var
11 | 111 | varp

in figure 11.18, the customer summary rows were built with the subtotal function, allowing the grand total row to be calculated with the simple formula subtotal(9,c2:c76). in contrast, once someone has built manual subtotals using the sum function as shown previously in figure 11.15, the subtotal function won't work.
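to see why subtotal makes the grand total easy, the following python sketch models a column as a list of rows tagged as either detail values or customer totals. the data and the names rows, plain_sum, and subtotal_sum are hypothetical; the tags stand in for excel knowing which cells contain a subtotal function.

```python
# Hypothetical column: two customers, each followed by a customer total row.
rows = [
    ("detail", 100), ("detail", 200), ("subtotal", 300),
    ("detail", 50), ("detail", 25), ("subtotal", 75),
]

def plain_sum(column):
    """Like SUM over the whole column: detail and total rows are both added."""
    return sum(amount for _, amount in column)

def subtotal_sum(column):
    """Like SUBTOTAL(9, ...): rows that are themselves subtotals are skipped."""
    return sum(amount for kind, amount in column if kind != "subtotal")

print("sum of column:       ", plain_sum(rows))
print("old accounting trick:", plain_sum(rows) / 2)
print("subtotal(9, column): ", subtotal_sum(rows))
```

the plain sum counts every dollar twice, which is exactly why the divide-by-two trick works; the subtotal-style total gets the same answer without the trick. when the customer totals are built with sum, excel has no such tag to go on, so a grand total built with subtotal would add them in.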
to convert manually built sum totals so that they work with subtotal, you would have to replace sum( with subtotal(9, in each of the existing subtotal lines.

figure 11.18: when you use subtotal instead of sum for the customer totals, the problem of creating a grand total becomes simple.

using subtotal instead of sum to ignore rows hidden by a filter

if you are using a filter to query a data set, you can use the subtotal function instead of the sum function to show the total of the visible rows. in figure 11.19, cell e1 contains a sum function, which totals rows whether they are visible or not. cell e2 contains a subtotal function. as you use the autofilter drop-downs to show just rows for sales of j730 by jamie, the subtotal function updates to reflect the total of the visible rows. this makes the subtotal function a great tool for ad hoc reporting.

figure 11.19: the subtotal function in cell e2 ignores rows hidden as the result of a filter.

using rand and randbetween to generate random numbers and data

in a number of situations, you might want to generate random numbers. excel offers two functions to assist with this process: rand and randbetween.

syntax: rand()

the rand function returns an evenly distributed random number greater than or equal to 0 and less than 1. a new random number is returned every time the worksheet is calculated.

rand() generates a random decimal from 0 up to, but not including, 1. whether you are a teacher trying to randomly assign the order for book report presentations, or the commissioner of a fantasy football league trying to figure out the draft sequence, rand() can help.

if you want to use rand to generate a random number but don't want the numbers to change every time the cell is calculated, you can enter rand() in the formula bar and then press f9 to change the formula to a random number.

to generate a random number greater than or equal to 0 but less than 100, you can use rand()*100.
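the scaling idea behind rand()*100 can be sketched in python, using random.random() as a stand-in for rand() (both return a value greater than or equal to 0 and less than 1). the function names are hypothetical, and the seed is set only so that the sketch produces the same numbers each time it runs; excel's rand has no such seed argument.

```python
import random

def rand():
    """Stand-in for RAND(): a float greater than or equal to 0 and less than 1."""
    return random.random()

def rand_scaled(upper):
    """Like RAND()*upper: greater than or equal to 0 and less than upper."""
    return rand() * upper

random.seed(2010)  # seeded only to make the sketch repeatable
samples = [rand_scaled(100) for _ in range(1000)]

print("smallest sample:", min(samples))
print("largest sample: ", max(samples))
```

multiplying stretches the 0-to-1 interval without changing how evenly the values are spread, so the samples stay at or above 0 and below 100.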
to generate a random sequence for a list, you select a blank column next to your data and enter rand() in the column. every time you press the f9 key, the column generates a new set of random numbers. you might want to agree up front with the draft participants that you will press f9 three times to randomize the list and then convert the formulas to values. to do so, you follow these steps:
1. enter the heading random in row 1 next to your data.
2. enter rand() in cell b2.
3. move the cell pointer to cell b2 and double-click the fill handle.
4. turn off automatic calculation by using formulas, calculation options, manual.
5. press the f9 key three times.
6. choose one cell in column b.
7. from the data tab, click the az button to sort ascending.

the new sequence of items in column a is a random sequence (see figure 11.20).

you can also use this technique to select a random subset from a data set. if your manager wants you to contact about one customer in 20, you can select all the customers where rand() is 0.05 or less.

figure 11.20: harriet gets to draft first in this season's fantasy football league, thanks to the rand function.

note: although the function in figure 11.19 uses the function number 109, the subtotal command always ignores rows hidden as the result of a filter. subtotal(9,e5:e5090) would return an identical result when the rows are hidden through a filter, as in this case. if you have rows hidden by the hide command, you will want to use 109 to ignore the manually hidden rows.

syntax: randbetween(bottom,top)

whereas rand() returns a random decimal, randbetween generates an integer between two integers.

the randbetween function returns a random number between the numbers you specify. a new random number is returned every time the worksheet is calculated. this function takes the following arguments:
bottom: this is the smallest integer randbetween can return.
top: this is the largest integer randbetween can return.

to generate random numbers between 50 and 59, inclusive, you use randbetween(50,59). randbetween is easier to use than rand to achieve random integers; with rand, you would have to use int(rand()*10)+50 to generate this same range of data.

even though randbetween generates integers, you can use it to generate sales prices or even letters. randbetween(5000,9900)/100 generates random prices between 50.00 and 99.00.

the capital letter a is also known as character 65 in the ascii character set. b is 66, c is 67, and so on up through z, which is character 90. you can use char(randbetween(65,90)) to generate random capital letters.

many of the product skus in this book were generated using char(randbetween(65,90))&randbetween(101,199).

figure 11.21 shows many examples of randbetween.

figure 11.21: randbetween can generate integers, or with a little creativity, prices or letters.

choosing a random item from a list

in figure 11.22, you want to randomly assign employees to certain projects. the list of projects is in column a. the list of employees is in e2:e6. as shown in figure 11.22, the function for b2:b11 is index(e2:e6,randbetween(1,5)).

figure 11.22: i wonder if dilbert's pointy-haired boss assigns projects this way.

using roman() to finish movie credits

excel can convert numbers to roman numerals. if you stay in the theater after a movie until the end of movie credits, you will see that the copyright date is always expressed in roman numerals. if you are the next steven spielberg, you can use roman(2011) or roman(year(now())) to generate such a numeral.

syntax: roman(number,form)

the roman function converts an arabic numeral to roman, as text. this function takes the following arguments:
• number: this is the arabic numeral you want converted.
• form—this is a number that specifies the type of roman numeral you want. the roman numeral style ranges from classic to simplified, becoming more concise as the value of form increases.
there are some arcane rules with roman numerals. in classic roman numerals, an i before a v indicates the number 4. it is valid to use an i before a v or an x, but it is not valid to use an i before an l, a c, a d, or an m. as shown in figure 11.23, the form argument allows excel to bend these rules progressively more:
• roman(1999,0) results in mcmxcix. the m is 1000, the cm is 900, the xc is 90, and the ix is 9; 1000 + 900 + 90 + 9 = 1999.
• roman(1999,1) results in mlmvliv. the m is 1000, the lm is 950, the vl is 45, and the iv is 4; 1000 + 950 + 45 + 4 = 1999.
• roman(1999,2) results in mxmix. the m is 1000, the xm is 990, and the ix is 9; 1000 + 990 + 9 = 1999.
• roman(1999,3) results in mvmiv. the m is 1000, the vm is 995, and the iv is 4; 1000 + 995 + 4 = 1999.
• roman(1999,4) results in mim. the m is 1000 and the im is 999; 1000 + 999 = 1999.
using abs() to figure out the magnitude of error
suppose that you work for a local tv station, and you want to prove that your forecaster is more accurate than those at the other stations in town. the forecaster at the rival station in town is horrible: some days he misses high, and other days he misses low. the rival station uses figure 11.24 to say that his average forecast is 99% accurate. all those negative and positive errors cancel each other out in the average.
the abs function measures the size of the error. positive errors are reported as positive, and negative errors are reported as positive as well. you can use abs(a2-b2) to demonstrate that the other station’s forecaster is off by 20 degrees on average.
caution in a previous book, i joked that if you had bad financial news to share with stockholders, you might try converting your financial statement to roman numerals.
however, you can use the roman function only in limited circumstances. negative numbers, 0, and numbers over 3,999 cannot be represented with the roman function.
figure 11.23 you can create movie credit dates with cell a3 or present bad financial news with f1:g13. compare the various forms of roman numerals in a7:a11.
figure 11.24 abs measures the size of an error, ignoring the sign.
syntax: abs(number)
the abs function returns the absolute value of a number, that is, the number without its sign. the argument number is the real number for which you want the absolute value.
using pi to calculate cake or pizza pricing
how many more ingredients are in a 16-inch pizza than an 8-inch pizza? be careful: it is not double! the formula for the area of a circle is πr². the radius of a circle is half the diameter. the function pi() returns the constant for pi. you use pi()*(b7/2)^2 to calculate the number of square inches in a 16-inch pizza. as shown in figure 11.25, the 16-inch size contains four times the area of an 8-inch circle.
figure 11.25 most pizza shops don’t have a dedicated cost accountant.
if your company makes anything round (drink coasters, drum heads, wedding cakes, pizzas, or frisbees), you want to use pi() when calculating your product cost.
syntax: pi()
the pi function returns the number 3.14159265358979, the mathematical constant, accurate to 15 digits.
using combin to figure out lottery probability
your office lottery pool may agree to bet $1 on the lottery each week but to double the bet when the jackpot is a higher payout than the odds against winning. the combin function can figure out the number of combinations for most lottery systems.
if you have to correctly select 6 numbers out of a pool of 48 numbers, you can use combin(48,6) to find that there are about 12.3 million combinations. figure 11.26 shows a variety of lottery odds.
figure 11.26 the odds of winning the lottery in a 44-number game are about twice as good as in a 50-number game.
see chapter 14, “using statistical functions,” for more information about working with the combin function.
note the combin function assumes that you don’t care about the sequence of the numbers chosen. if you have to worry about the sequence, you should use permut.
using fact to calculate the permutation of a number
suppose that you have seven slides in a powerpoint presentation, and you want to find the number of unique sequences in which the slides can be arranged. this number is called the factorial of seven. you calculate it by using 7 × 6 × 5 × 4 × 3 × 2 × 1. to find the factorial of any positive integer, you use the fact function.
syntax: fact(number)
the fact function returns the factorial of a number. the factorial of a number is equal to 1 × 2 × 3 × ... × number. number is the nonnegative number of which you want the factorial. if number is not an integer, it is truncated. by definition, fact(0) is 1.
to figure out how many different ways you can arrange five people in a line, use fact(5).
there is a similar function called factdouble. a double factorial multiplies every other number. for even numbers, this is a calculation such as factdouble(8) = 8*6*4*2. for odd numbers, the calculation is factdouble(9) = 9*7*5*3*1. various factorials are shown in figure 11.27.
note it is difficult to find real-world uses for factdouble. mathworld.com notes some interesting uses for factdouble(n) where n is between -1 and -2, but excel does not calculate factdouble for negative numbers.
fans of the poker game texas hold ’em will be delighted to know that factdouble is useful in calculating texas hold ’em probabilities. for complete details, look up poker probabilities (texas hold ’em) in wikipedia.
figure 11.27 excel calculates the fact and factdouble of various numbers.
using gcd and lcm to perform seventh-grade math
my seventh-grade math teacher, mr. irwin, taught me about greatest common divisors and least common multiples. for example, the least common multiple of 24 and 36 is 72. the greatest common divisor of 24 and 36 is 12. i have to admit that i never saw these concepts again until my son josh was in seventh grade. this must be permanently part of the seventh-grade curriculum. if you are in seventh grade or you are assisting a seventh grader with his or her math lesson, you will be happy to know that excel can calculate these values for you.
syntax: gcd(number1,number2,...)
the gcd function returns the greatest common divisor of two or more integers. the greatest common divisor is the largest integer that divides both number1 and number2 without a remainder. the arguments number1, number2,... are 1 to 29 values. if any value is not an integer, it is truncated. if any argument is nonnumeric, gcd returns a #value! error. if any argument is less than zero, gcd returns a #num! error.
the number 1 divides any value evenly. a prime number has only itself and 1 as even divisors.
syntax: lcm(number1,number2,...)
the lcm function returns the least common multiple of integers. the least common multiple is the smallest positive integer that is a multiple of all integer arguments: number1, number2, and so on. you use lcm to add fractions with different denominators. the arguments number1, number2,... are 1 to 29 values for which you want the least common multiple. if the value is not an integer, it is truncated. if any argument is nonnumeric, lcm returns a #value! error.
if any argument is less than one, lcm returns a #num! error.
using multinomial to solve a coin problem
although the multinomial distribution is a fairly complex mathematical concept, the following example illustrates a fun puzzle that can be solved with the function.
syntax: multinomial(number1,number2,...)
the multinomial function returns the ratio of the factorial of a sum of values to the product of factorials. the arguments number1, number2,... are 1 to 255 values for which you want the multinomial. for example, multinomial(a,b,c,d) is (a+b+c+d)! / (a! b! c! d!).
suppose that you have a huge jar that contains hundreds of pennies, nickels, dimes, and quarters. you reach into the jar and pull out six coins. how many possible combinations of coins can there be?
to picture this problem, you should sort the six coins from low to high. you can use three movable dividers to group the coins into denominations. in the left side of figure 11.28, for example, you’ve arranged the dividers to indicate one penny, one nickel, three dimes, and one quarter. it is possible to pull out none of a particular coin. in the image on the right, you’ve pulled out five pennies and one dime. in this case, the penny and nickel dividers are adjacent because there are no nickels. in every case, the quarter divider must always be at the bottom, so how many ways are there to arrange the other three dividers among six coins?
someone figured out that the answer to this problem is the factorial of (dividers + coins) divided by the product of the factorial of coins and the factorial of dividers. in math terms, this is (3+6)! / (3! 6!). remarkably, excel has a function for solving the coin problem. multinomial(3,6) performs the calculation (3+6)! / (3! 6!), which is 84.
figure 11.28 solving this problem with multinomial will amuse boy scout groups and middle school math students.
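the divider argument can be checked outside excel. the following python sketch is only an illustration and is not part of excel; the helper names multinomial and count_coin_draws are made up for this example. it computes (3+6)! / (3! 6!) from factorials and, as a cross-check, counts by brute force every way to split six coins among four denominations.

```python
from itertools import product
from math import factorial


def multinomial(*counts):
    """Factorial of the sum of counts divided by the product of each count's factorial."""
    result = factorial(sum(counts))
    for count in counts:
        result //= factorial(count)  # each step divides evenly, so // is exact
    return result


def count_coin_draws(coins=6, denominations=4):
    """Brute force: count the ways to split `coins` among the denominations."""
    return sum(
        1
        for combo in product(range(coins + 1), repeat=denominations)
        if sum(combo) == coins
    )


dividers, coins = 3, 6
formula_answer = multinomial(dividers, coins)
brute_force_answer = count_coin_draws(coins, dividers + 1)
print(formula_answer, brute_force_answer)
```

if the two printed numbers agree, the brute-force count supports the dividers-and-coins formula for this jar of four denominations.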
using mod to find the remainder portion of a division problem
the mod function is one of the obscure math functions that i find myself using quite frequently. have you ever been in a group activity where everyone in the group was to count off by sixes? this is a great way to break up a group into six subgroups. it makes sure that friends who were sitting together get put into different groups.
using the mod function is a great way to apply this concept to records in a database. perhaps for auditing, you need to check every eighth invoice. or you need to break up a list of employees into four groups. you can solve these types of problems by using the mod function.
think back to when you were first learning division. if you had to divide 43 by 4, you would have written that the answer was 10 with a remainder of 3. if you divide 40 by 4, the answer is 10 with a remainder of 0.
the mod function divides one number by another and reports back just the remainder portion of the result. when you apply it to consecutive numbers, you end up with an even distribution of remainders. if you convert the formulas into values and sort, your data is broken into similar-size groups.
syntax: mod(number,divisor)
the mod function returns the remainder after number is divided by divisor. the result has the same sign as divisor. this function takes the following arguments:
• number—this is the number for which you want to find the remainder.
• divisor—this is the number by which you want to divide number. if divisor is 0, mod returns a #div/0! error.
the mod function is good for classifying records that follow a certain order. for example, the smartart gallery contains 84 icons arranged with 4 icons per row. to find the column for the 38th icon, use mod(38,4). the result, 2, puts the icon in the second column. the example in figure 11.29 assigns all employees to one of four groups.
note mod is short for modulo, the mathematical term for this operation. you would normally say that 17 modulo 3 is 2.
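as a rough illustration of the count-off idea, here is a short python sketch. python is used only for illustration; the employee names and the helper assign_groups are hypothetical and do not come from the figure. python’s % operator behaves like mod for integers in that the remainder takes the sign of the divisor.

```python
def assign_groups(names, group_count=4, first_row=2):
    """Group names by worksheet row number modulo group_count."""
    groups = {}
    for offset, name in enumerate(names):
        row = first_row + offset  # row the name would occupy below a heading row
        groups.setdefault(row % group_count, []).append(name)
    return groups


# Made-up sample data standing in for a column of employee names.
employees = ["ann", "bob", "cat", "dan", "eve", "fay", "gus", "hal"]
groups = assign_groups(employees)
for remainder in sorted(groups):
    print(remainder, groups[remainder])
```

because consecutive row numbers cycle through the remainders 0 to 3, neighbors in the list land in different groups, and each group ends up about the same size.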
figure 11.29 to organize these employees into four groups, use mod(row(),4). then paste the values and sort by the remainders.
using quotient to isolate the integer portion in a division problem
as you just learned, the mod function isolates the remainder portion in a division problem. the quotient function isolates the integer portion in a division problem.
if you divide 43 by 4, the answer is 10 with a remainder of 3. the quotient function returns just the whole number 10 and ignores the remainder. this function is great for calculating full cases of products. suppose you pay a worker for assembling products. you pay the worker for each complete case of 4 items produced. if he produces 43 items in his shift, this is 10 complete cases. quotient(43,4) would provide an answer of 10.
syntax: quotient(numerator,denominator)
the quotient function returns the integer portion in a division problem. you use this function when you want to discard the remainder in a division problem. this function takes the following arguments:
• numerator—this is the dividend.
• denominator—this is the divisor.
if either argument is nonnumeric, quotient returns a #value! error.
many people simulate the quotient function by using the int function. to keep the integer portion of a division, you could use int(43/4). however, quotient and int differ when the result is negative. whereas quotient(5,-4) returns -1, int(5/-4) actually goes down to -2. quotient discards the remainder, whereas int always rounds down, so quotient is the safer choice if the results might be negative. if you are a fan of using int to simulate the quotient, consider using trunc() or iso.ceiling() instead (iso.ceiling matches quotient only when the result is negative, because it always rounds up). figure 11.30 shows the differences between quotient, int, trunc, and iso.ceiling.
figure 11.30 quotient is more accurate than int when the result is negative.
using product to multiply numbers
the product function multiplies a range of numbers by each other.
although you could calculate product(2,2), the product function is designed to multiply all numbers in a range, such as product(a2:a50).
syntax: product(number1,number2,...)
the product function multiplies all the numbers given as arguments and returns the product.
the arguments number1, number2,... are 1 to 255 numbers that you want to multiply. if you pass a single-cell argument that contains a text representation of a number, it is used in the multiplication. however, if one of the arguments is a multicell range, any text entry in that range is ignored.
in figure 11.31, an array formula in b16 finds all the steps matching a particular book and multiplies the completion flags together. if 100% of the steps for a book are marked with a 1 to indicate complete, the product will also be 1. if any step is incomplete, the product will be zero. note that the formula in b16 needs to be completed by holding down ctrl+shift while pressing enter.
figure 11.31 product can check to see if all steps are nonzero.
using sqrt and power to calculate square roots and exponents
most calculators offer a square root button, so it seems natural that excel would offer a sqrt function to do the same thing.
to square a number, you multiply the number by itself, ending up with a square. for example, 5 × 5 = 25.
a square root is a number that, when multiplied by itself, leads to a square. for example, the square root of 25 is 5, and the square root of 49 is 7. some square roots are more difficult to calculate. the square root of 8 is a number between 2 and 3, somewhere close to 2.828. you can calculate the number with sqrt(8).
a related function is the power function. if you want to write the shorthand for 6 × 6 × 6 × 6 × 6, you would say “six to the fifth power,” or 6^5. excel can calculate this with power(6,5).
note there is a specialized version of sqrt, sqrtpi.
this function is handy for converting square shapes to equivalent-sized round shapes.
syntax: sqrt(number)
the sqrt function returns a positive square root. the argument number is the number for which you want the square root. if number is negative, sqrt returns a #num! error.
syntax: power(number,power)
the power function returns the result of a number raised to a power. this function takes the following arguments:
• number—this is the base number. it can be any real number.
• power—this is the exponent to which the base number is raised.
the power function works with all sorts of non-integer numbers, such as 98.2 raised to the 3.4 power.
figuring out other roots and powers
the sqrt function is provided because some math people expect it to be there. there are no equivalent functions to figure out other roots. if you multiply 5 × 5 × 5 to get 125, then the third root of 125 is 5. the fourth root of 625 is 5. even a $30 calculator offers a key to generate various roots beyond a square root. excel does not offer a cube root function.
in reality, even the power and the sqrt functions are not necessary, because the caret operator (^) raises a number to a power:
• 6^3 is 6 raised to the third power, which is 6 x 6 x 6, or 216.
• 2^8 is 2 to the eighth power, which is 2 x 2 x 2 x 2 x 2 x 2 x 2 x 2, or 256.
for roots, you can raise a number to a fractional power:
• 256^(1/8) is the eighth root of 256. this is 2.
• 125^(1/3) is the third root of 125. this is 5.
to review information on how the caret operator is used to calculate powers and roots, refer back to chapter 8, “understanding formulas.”
thus, instead of using sqrt(25), you could just as easily use 25^(1/2). however, people reading your worksheets are more likely to understand sqrt(25) than 25^(1/2).
see chapter 15, “using trig, matrix, and engineering functions,” to read more about the sqrt and sqrtpi functions.
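the fractional-power idea is not specific to excel. the short python sketch below is offered only as an illustration (python writes the power operator as ** rather than ^, and the helper name nth_root is an assumption made for this example). it shows that raising a value to 1/n gives the nth root.

```python
import math


def nth_root(value, n):
    """Return the n-th root of a non-negative value by raising it to the power 1/n."""
    return value ** (1.0 / n)


# Each pair is (value, root wanted); the values are the examples used in the text.
examples = [(25, 2), (125, 3), (625, 4), (256, 8)]
for value, n in examples:
    # Rounding hides tiny floating-point error in the fractional exponent.
    print(value, n, round(nth_root(value, n), 10))

# A dedicated square root function and a power of 1/2 should agree.
print(math.isclose(math.sqrt(25), nth_root(25, 2)))
```

results computed this way are floating-point approximations, so a root that should be a whole number can come back a hair above or below it; comparing with a tolerance, or rounding for display, avoids surprises.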
using sign to determine the sign of a number
although the sign function belongs with the information functions, microsoft groups it with the math functions. you can see it used in the mround function example shown previously in this chapter to prevent an error. simply, sign(number) reports whether number is negative, zero, or positive.
syntax: sign(number)
sign determines the sign of a number. it returns 1 if the number is positive, 0 if the number is 0, and -1 if the number is negative. the argument number is any real number.
using countif, averageif, and sumif to conditionally count, average, or sum data
the countif and sumif functions are young and popular. in contrast to most functions that have been around since the 1980s, these functions were added in excel 97. the averageif function is even newer, having been added in excel 2007. math purists may point out that you could perform equivalent calculations by using dsum, or sumproduct, or even an array formula long before microsoft added these functions. however, it is far easier to grasp doing calculations with countif, averageif, and sumif.
figure 11.32 shows a database that contains thousands of records. your goal is to find out how many records came from each region. one way to write the formula for the east region is countif(c11:c5011,"east"). however, it is far more interesting to write the formula as shown in cell b2: countif($c$11:$c$5011,a2). after this formula is entered, you can build a table of the unique regions in column a, copy the formula down column b, and quickly have a summary table built with the help of countif.
figure 11.32 countif and sumif are simpler to use than dsum, sumproduct, or array formulas.
syntax: countif(range,criteria)
the countif function counts the number of cells within a range that meet the given criteria.
this function takes the following arguments:
• range—this is the range of cells from which you want to count cells.
• criteria—this is the criteria in the form of a number, an expression, or text that defines which cells will be counted. for example, criteria can be expressed as 32, "32", ">32", or "apples". any criteria that contains text or a mathematical operator must be enclosed in quotes. for numeric criteria, the quotes are not required.
• you can use the wildcard characters question mark (?) and asterisk (*) in criteria. a question mark matches any single character; an asterisk matches any sequence of characters. if you want to find an actual question mark or asterisk, you need to type a tilde (~) before the character.
after you have mastered countif, it is easy to master sumif or averageif. in most cases, the sumif function adds one new argument. whereas countif would ask for a range of data and then the value to look for in that range, sumif usually needs three arguments: sumif asks for a range of data, the value to look for in that range, and then another range of data to be summed when a match is found.
in figure 11.32, c11:c5011 contains the range to search. cell a2 contains the value for which to search. when excel finds a matching value in column c, you want excel to return the corresponding cell from the revenue column in h11:h5011. most people would write sumif(c11:c5011,a2,h11:h5011) to do this. it turns out that excel forces the third argument to have the same shape as the first argument. if you would happen to accidentally specify h11:h4011, excel would ignore your range and use h11:h5011 because this is the same shape as the first argument. thus, it is sufficient to write the formula as sumif(c11:c5011,a2,h11).
syntax: sumif(range,criteria,sum_range)
syntax: averageif(range,criteria,average_range)
the sumif function adds the cells specified by a given criteria.
the averageif function averages the cells specified by a given criteria.
occasionally, the range you want to search is also the range to sum. for example, perhaps your criteria is to look for rows where the revenue is greater than 100,000. in this case, because your range to add is the same as your range to search, you can leave off the third argument, as shown in cell h2 of figure 11.32.
the sumif function takes the following arguments:
• range—this is the range of cells you want evaluated.
• criteria—this is the criteria in the form of a number, an expression, or text that defines which cells will be added. for example, criteria can be expressed as 32, "32", ">32", or "apples".
• sum_range—this is the range of cells to sum. the cells in sum_range are summed only if their corresponding cells in range match the criteria. if sum_range is omitted, the cells in range are summed.
• average_range—if cells in the average_range are empty or contain text or true/false, they are ignored in the calculation of average.
note an interesting variation on the sumif, averageif, and countif functions is worth mentioning. it is possible to build the criteria argument on-the-fly. to count records that are above average, you can use countif(h11:h5011,">"&average(h11:h5011)). mastering the sumif and countif functions invariably leads to more questions about doing more powerful versions. if you need to sum based on more than one condition, you can use dsum, sumproduct, or sumifs. the sumifs function is discussed in the next section.
see chapter 12, “using powerful functions: logical, lookup, and database functions,” to learn more about using the dsum function.
to learn more about the sumproduct function, see chapter 15, “using trig, matrix, and engineering functions.”
using conditional formulas with multiple conditions: sumifs(), averageifs(), and countifs()
when someone sees how easy using sumif() is, they invariably want the function to do more. one of the most frequent questions at the mrexcel message board is along the lines of this: “i am using sumif() to get a total by region. how can i put two conditions in there to only get the total for a certain region and product?” in legacy versions of excel, there were ways to do this, but they were difficult. you had to use either sumproduct(), dsum(), or an array formula. there is a lot of complexity in going from a simple sumif() to the complex boolean logic required to understand sumproduct().
thankfully, excel 2007 added plural versions of sumif(), countif(), and averageif() that can handle not just two conditions, but up to 127 conditions. the three new functions add the letter s to the end of the function name (that is, sumifs(), countifs(), and averageifs()), to signify that multiple ifs are being considered.
with sumifs() and averageifs(), you first specify the range to be summed or averaged. you then specify pairs of arguments. in each pair, you first specify the range to check and then the value to match in that range. the following sections describe these three functions.
syntax: sumifs(sum_range,criteria_range1,criteria1[,criteria_range2,criteria2...])
the sumifs() function adds the cells in a range that meet multiple criteria. note the following in this syntax:
• sum_range is the range to sum.
• criteria_range1, criteria_range2, and so on are one or more ranges in which to evaluate the associated criteria.
tip the order of the arguments differs between sumif and sumifs. the sum_range is the first argument in sumifs but the third argument in sumif. it seems pretty common that you would be editing a sumif function to add additional conditions.
remember to move the sum_range to be the first argument when you are moving from sumif to sumifs.
• criteria1, criteria2, and so on are one or more criteria in the form of a number, an expression, a cell reference, or text that define which cells will be added. for example, they can be expressed as 32, "32", ">32", "apples", or b4.
• each cell in sum_range is summed only if all the corresponding criteria specified are true for that cell.
• cells in sum_range that contain true evaluate to 1; cells in sum_range that contain false evaluate to 0.
• you can use the wildcard characters question mark (?) and asterisk (*) in criteria. a question mark matches any single character; an asterisk matches any sequence of characters. if you want to find an actual question mark or asterisk, you need to type a tilde (~) before the character.
• unlike the range and sum_range arguments in sumif, the size and shape of each criteria_range and sum_range must be the same.
in figure 11.33, you want to build a table that shows the total by region and product. sum_range is the revenue in h11:h5011. the first criteria pair consists of the regions in c11:c5011 being compared to the word east in b1. the second criteria pair consists of the products in b11:b5011 being compared to g854 in a2. the formula in b2 is sumifs($h$11:$h$5011,$c$11:$c$5011,b$1,$b$11:$b$5011,$a2). the dollar signs keep the ranges, the heading row, and the product column fixed, so you can copy this formula to b2:d6.
figure 11.33 the new sumifs() function is used to create this summary by region and product.
syntax: averageifs(average_range,criteria_range1,criteria1[,criteria_range2,criteria2...])
the averageifs() function is similar to sumifs(). it returns the average (arithmetic mean) of all cells that meet multiple criteria. the arguments are the same as for sumifs().
syntax: countifs(range1,criteria1[,range2,criteria2...]
)
countifs() counts the number of cells in a range that meet multiple criteria. the countifs() syntax is a bit different from the syntax of the other new functions. with countifs(), there is no need to specify sum_range. the arguments in countifs() consist of pairs specifying criteria. the first argument in each pair specifies a criteria range. the second argument in each pair specifies the criteria value to match.
dates and times in excel
date calculations can drive people crazy in excel. if you gain a certain confidence with dates in excel, you will be able to quickly resolve formatting issues that come up.
here is why dates are a problem. first, excel stores dates as the number of days since january 1, 1900. for example, june 30, 2011, is 40724 days after 1/1/1900. when you enter 6/30/2011 in a cell, excel secretly converts this entry to 40724 and formats the cell to display a date instead of the value. so far, so good.
the problem arises when you try to calculate something based on the date. when you try to perform a calculation on two cells when the first cell is formatted as currency and the second cell is formatted as fixed numeric with three decimals, excel has to decide if the new cell inherits the currency format or the fixed with three decimals format. these rules are hard to figure out. in any given instance, you might get the currency format or the fixed with three decimals format, or you might get the format previously assigned to the cell with the new formula. with numbers, a result of 80.52 or 80.521 looks about the same. you can probably understand either format.
however, imagine that one of the cells is formatted as a date. another cell contains the number 30. if you add the 30 to the date, which format does excel use? if the cell containing the new formula happened to be previously assigned a numeric format, the answer suddenly switches from a date format to the numeric equivalent. this is frustrating. it is confusing.
you start with june 30, 2011, add 30 days, and get an answer of 40754. this makes no sense to an excel novice. it forces many people to give up on dates and start storing dates as text that look like dates. this is unfortunate because you can’t easily do calculations on text cells that look like dates.
here is a general guideline to remember: if you work with dates in the range of the years 2000 to 2020, those numeric equivalents are from 36,526 through 44,196. if you do some date math and get a strange answer in the 35,000–45,000 range, excel probably has the right answer, but the numeric format of the answer cell is wrong. you need to select date from the number drop-down on the home tab to correct the format.
the excel method for storing dates is simple when you understand it. if you have a date cell and need to add 15 days to it, you add the number 15 to the cell. every day is equivalent to the number 1, and every week is equivalent to the number 7. this is very simple to understand.
when you see 40359 instead of june 30, 2010, excel calls the 40359 a serial number. some of the excel functions discussed here convert from a serial number to text that looks like a date, or vice versa.
for time, excel adds a decimal to the serial number. there are 24 hours in a day. the serial number for 6 a.m. is 0.25. the serial number for noon is 0.5. the serial number for 6 p.m. is 0.75. the serial number for 3 p.m. on june 30, 2010, is 40359.625.
to see how this works, try this out:
1. open a blank excel workbook.
2. in any cell, enter a number in the range of 35,000 to 45,000.
3. add a decimal point and any random digits after the decimal.
4. select that cell.
5. from the home tab, select the dialog launcher in the lower-right corner of the number group.
6. in the date category, scroll down and select the format 3/14/01 1:30 pm. excel displays your random number as a date and time.
if the decimal portion of your number is greater than 0.5, the result will be in the p.m. portion of the day.
7. go to another cell and enter the date you were born, using a four-digit year. (this doesn’t work if you are older than 110.)
8. again select the cell and format it as a number. excel converts to show how many days after the start of the last century you were born. this is great trivia but not necessarily useful.
the point is that excel dates are nothing to be afraid of. you need to understand that behind the scenes, excel is storing your dates as serial numbers and your times as decimal serial numbers.
occasionally, circumstances cause a date to be displayed as a serial number. although this freaks some people out, it is easy to fix using the format cells dialog.
other times, when you want the serial number (for example, to calculate elapsed days between two dates), excel converts the serial number to a date, indicating, for example, that an invoice is past due by “february 15 1900” days. when you get these types of non sequiturs, you can visit the format cells dialog.
caution although most excel date issues can be resolved with formatting, you should be aware of some real date problems:
• on a macintosh, excel dates are stored since january 1, 1904. if you are using a mac, your serial number for a date in 2010 will be different from that on a windows pc. excel handles this conversion when files are moved from one platform to another.
• excel dates cannot handle dates in the 1800s or before. this really hacks off all my friends who do genealogy. if your great-great-great uncle silas was born on february 17, 1895, you are going to have to store that as text.
• excel dates from january 1, 1900, through march 1, 1900, are generally wrong. see figure 11.34 and the following sidebar for more details.
• around y2k, someone decided that 1930 is the dividing line for two digit years.
If you enter a date with a two-digit year, the result is in the range of 1930 through 2029. If you enter 12/31/29, this will be interpreted as 2029. If you enter 1/1/30, it will be interpreted as 1930. If you need to enter a mortgage ending date of 2040, for example, be sure to use the four-digit year, 6/15/2040.

Blame It on Sosigenes

The programmer who designed Lotus 1-2-3 was not a date fanatic.

Back in 45 BC, an astronomer named Sosigenes calculated that the earth took 365 days, 5 hours, 48 minutes, and 46 seconds to travel around the sun. He advised Julius Caesar that this was “close enough” to 365.25 days, and the leap year was born.

This worked through Caesar’s lifetime. But those missing 11 minutes and 14 seconds began to add up. By 1582, things were out of whack by about 10 days. The spring equinox was falling around March 11 instead of March 21. Pope Gregory mandated that the calendar jump by 10 days. In Catholic countries, they went from October 4, 1582, to October 15, 1582. Other countries, though, resisted the change. England finally switched in 1752, adding 11 days. Russia switched in 1918.

Historians note that there was rioting over the change (possibly from all the people who lost out on their birthday cake?). To prevent further rioting, Gregory proposed that we skip three leap years out of every 400 years. This led to some arcane rules for leap years:
• Leap years happen in years divisible by 4.
• Leap years are skipped if the year is divisible by 100.
• Leap years are not skipped if the year is divisible by 400.

The date February 29, 2000, was actually an exception to an exception to an exception to an exception. But everyone thought it was just another leap year.
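The three leap-year rules listed above collapse into a single test. The following is a minimal Python sketch of those rules; the function name is_leap_year is made up for this illustration and is not something Excel provides.

```python
def is_leap_year(year):
    """Apply the three Gregorian rules from the list above.

    Divisible by 4 -> leap year, unless divisible by 100,
    unless also divisible by 400.
    """
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


for year in (1900, 2000, 2011, 2012):
    print(year, is_leap_year(year))
# 1900 False
# 2000 True
# 2011 False
# 2012 True
```

Under these rules, 1900 is a common year and 2000 is a leap year, which is the distinction the rest of this sidebar turns on.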
The problem is that there was no leap year in 1900. The programmer working on Lotus 1-2-3 in Mitch Kapor’s Cambridge basement didn’t know this rule and programmed a 2/29/1900 into Lotus 1-2-3. By the late 1980s, there were millions of Lotus spreadsheets created that had dates in them. Any competitor to Lotus had to ensure that its program would come up with the exact same result as the industry-standard Lotus 1-2-3. This forced Excel, Quattro, and others to program the same error into their packages. Now, billions of spreadsheets exist with dates in them. If Microsoft ever corrects this problem, there will again be rioting in the streets.

The odds of this problem actually affecting you are slim. You would need to be calculating a date span from before February 28, 1900, to after March 1, 1900. Because Excel can handle dates going back only to January 1, 1900, only 49 possible starting dates can cause problems.

Figure 11.34 A team of astronomers probably worked for hours to calculate what now takes seconds in Excel.

Understanding Excel Date and Time Formats

It is worthwhile to learn the various Excel custom codes for date and time formats. Figure 11.35 shows a table of how March 5 would be displayed in various numeric formats.

Figure 11.35 Any of these custom date format codes can be typed in the custom numeric format box.

The codes in A4:A13 show the possible codes for displaying just date, month, or year. Most people know the classic mm/dd/yyyy format, but far more formats are available. You can cause Excel to spell out the month and weekday by using codes such as dddd, mmmm d, yyyy. These are the possibilities:
mm: Displays the month with two digits. Months before October are displayed with a leading zero (for example, January is 01).
m: Displays the month with one or two digits, as necessary.
mmm: Displays a three-letter abbreviation for the month (for example, Jan, Feb).
mmmm: Spells out the month (for example, January, February).
mmmmm: Displays the first letter of the month, useful for creating “JFMAMJJASOND” chart labels.
dd: Displays the day of the month with two digits. Dates earlier than the 10th of the month are displayed with a leading zero (for example, the 1st is 01).
d: Displays the day of the month with one or two digits, as needed.
ddd: Displays a three-letter abbreviation for the name of the weekday (for example, Mon, Tue).
dddd: Spells out the name of the weekday (for example, Monday, Tuesday).
yy or y: Uses two digits for the year (for example, 07).
yyyy or yyy: Uses four digits for the year (for example, 2007).

You are allowed to string together any combination of these codes with a space, comma, slash, or dash. It is valid to repeat a portion of the date format. For example, the format dddd, mmmm d, yyyy shows the day portion twice in the date and would display as Saturday, March 5, 2011.

Although the date formats are mostly intuitive, several difficulties exist in the time formats. The first problem is the m code. Excel has already used m to mean month. In a time format, you cannot use m alone to mean minutes. The m code must either be preceded or followed by a colon.

There is another difficulty: When you are dealing with years, months, and days, it is often perfectly valid to mention only one of the portions of the date without the other two. It is common to hear any of these statements:
• “I was born in 1965.”
• “I am going on vacation in July.”
• “I will be back on the 27th.”

If you have a date such as March 5, 2011, and use the proper formatting code, Excel happily tells you that this date is March or 2011 or the 5th. Technically, Excel is leaving out some really important information: the 5th of what? As humans, we can often figure out that this probably means the 5th of the next month.
Thus, we aren’t shocked that Excel is leaving off the fact that it is March 2011.

Tip
Custom number formats are entered in the Format Cells dialog. There are three ways to display this dialog:
1. Press Ctrl+1.
2. From the Home tab, in the Number group, select the drop-down and select More Number Formats from the bottom of the drop-down.
3. Click the expand icon in the lower-right corner of the Number group on the Home tab.
When the Format Cells dialog is displayed, you select the Number tab. In the Category list, you select Custom. In the Type box, you enter your custom format. The Sample box displays the active cell with the format applied.

Imagine how strange it would be if Excel did this with regular numbers. Suppose you have the number 352. Would Excel ever offer a numeric format that would display just the tens portion of the number? If you put 352 in a cell, would Excel display 5 or 50? It would make no sense.

Excel treats time as an extension of dates and is happy to show you only a portion of the time. This can cause great confusion. To Excel, 40 hours really means 1 day and 16 hours. If you create a timesheet in Excel and format the total hours for the week as h:mm, Excel thinks that you are purposefully leaving off the day portion of the format! Excel presents 45 hours as just 21 hours because it assumes you can figure out there is 1 day from the context. But our brains don’t work that way; 21 hours means 21 hours, not 1 day and 21 hours.

To overcome this problem in Excel, you use square brackets. Surrounding any time element with square brackets tells Excel to include all greater time/date elements in that one element, as in the following examples:
• 5 days and 10 hours in [h] format would be 130.
• 5 days and 10 hours in [m] format would be 7,800, to represent that many minutes.
• 5 days and 10 hours in [s] format would be 468,000, to represent that many seconds.
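The arithmetic behind the bracketed codes is straightforward. The sketch below is a Python illustration, not Excel code, and the two helper names are invented for this example. It approximates what h:mm and [h]:mm would show for the same duration, and then reproduces the three totals from the list above.

```python
from datetime import timedelta


def format_h_mm(duration):
    """Approximate h:mm, where whole days are silently dropped."""
    total_minutes = int(duration.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours % 24}:{minutes:02d}"


def format_bracket_h_mm(duration):
    """Approximate [h]:mm, where days are rolled into the hour count."""
    total_minutes = int(duration.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}:{minutes:02d}"


timesheet_total = timedelta(hours=45)
print(format_h_mm(timesheet_total))          # 21:00
print(format_bracket_h_mm(timesheet_total))  # 45:00

span = timedelta(days=5, hours=10)
total_seconds = int(span.total_seconds())
print(total_seconds // 3600)  # 130     ([h])
print(total_seconds // 60)    # 7800    ([m])
print(total_seconds)          # 468000  ([s])
```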
As shown in Figure 11.36, the time formatting codes include h, hh, s, ss, :mm, and mm:, all of which can be modified with square brackets.

Figure 11.36 Custom time format codes.

To display date and time, you enter the custom date format code, a space, and then the time format code.

Examples of Date and Time Functions

In all the examples in the following sections, you should use care to ensure that the resulting cell is formatted using the proper format, as discussed in the preceding section.

Using NOW and TODAY to Calculate the Current Date and Time or Current Date

There are a couple of keyboard shortcuts for entering date and time. Pressing Ctrl+; enters the current date in a cell. Pressing Ctrl+: enters the current time in a cell. However, both of these hotkeys create a static value; that is, the date or time reflects the instant that you typed the hotkey, and it never changes in the future.

Excel offers two functions for calculating the current date: NOW and TODAY. These functions are excellent for figuring out the number of days until a deadline or how late an open receivable might be.

Syntax: NOW() and TODAY()

NOW returns the serial number of the current date and time. TODAY returns the serial number of the current date.

The TODAY function returns today’s date, without any time attached. The NOW function returns the current date and time. Both of these functions can be made to display the current date, but there is an important distinction when you are performing calculations with the functions.

In Figure 11.37, column A contains NOW functions, and column C contains TODAY functions. Row 2 is formatted as a date and time. Row 3 is formatted as a date. Row 4 is formatted as numeric. Cells A3 and C3 look the same. If you need to display the date without using it in a calculation, then NOW or TODAY works fine. Row 8 calculates the number of days until a deadline approaches.
Although most people would say that tomorrow is 1 day away, the formula in A8 would tend to say that the deadline is 0.6969 days away. This can be deceiving. If you are going to use the result of NOW or TODAY in a date calculation, you should use TODAY to prevent Excel from reporting fractional days. The formula in A8 is A7-A3, formatted as numeric instead of a date.

Caution
It would be nice if NOW() would function like a real-time clock, constantly updating in Excel. However, the result is calculated when the file is opened, with each press of the F9 key, and when an entry is made elsewhere in the worksheet.

Using YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND to Break a Date/Time Apart

If you have a column of dates in July 2011, you can easily make them all look the same by using the mmm-yy format. However, the dates in the actual cells are still different. The July 2011 records are not sorted as if they were a tie.

Excel offers six functions that you can use to extract a single portion of the date: YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND. In Figure 11.38, cell A1 contains a date and time. Functions in A3 through A8 break out the date into components:
• YEAR(date) returns the year portion as a four-digit year.
• MONTH(date) returns the month number, from 1 through 12.
• DAY(date) returns the day of the month, from 1 through 31.
• HOUR(date) returns the hour, from 0 through 23.
• MINUTE(date) returns the minute, from 0 through 59.
• SECOND(date) returns the second, from 0 through 59.

In each case, date must contain a valid Excel serial number for a date. The cell containing the date serial number may be formatted as a date or as a number.

Using DATE to Calculate a Date from Year, Month, and Day

The DATE function is one of the most amazing functions in Excel. Microsoft’s implementation of this function is excellent, allowing you to do amazing date calculations.
Figure 11.37 NOW and TODAY can be made to look alike, but you need to choose the proper one if you are going to be using the result in a later calculation.

Syntax: DATE(year,month,day)

The DATE function returns the serial number that represents a particular date. This function takes the following arguments:
• year: This argument can be one to four digits. If year is between 0 and 1899 (inclusive), Excel adds that value to 1900 to calculate the year. For example, DATE(100,1,2) returns January 2, 2000 (1900+100). If year is between 1900 and 9999 (inclusive), Excel uses that value as the year. For example, DATE(2000,1,2) returns January 2, 2000. If year is less than 0 or is 10000 or greater, Excel returns a #NUM! error.
• month: This is a number representing the month of the year. If month is greater than 12, month adds that number of months to the first month in the year specified. For example, DATE(1998,14,2) returns the serial number representing February 2, 1999.
• day: This is a number representing the day of the month. If day is greater than the number of days in the month specified, day adds that number of days to the first day in the month. For example, DATE(1998,1,35) returns the serial number representing February 4, 1998.

In a trivial example, DATE(2011,3,5) returns March 5, 2011.

The true power in the DATE function occurs when one or more of the year, month, or day are calculated values. Here are some examples:
• If cell A2 contains an invoice date and you want to calculate the day one month later, you use DATE(YEAR(A2),MONTH(A2)+1,DAY(A2)).
• To calculate the beginning of the month, you use DATE(YEAR(A2),MONTH(A2),1).
• To calculate the end of the month, you use DATE(YEAR(A2),MONTH(A2)+1,1)-1.
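To make the rollover rules concrete, here is a minimal Python sketch that mimics how DATE treats an oversized month or day. The names excel_date, to_serial, and EXCEL_EPOCH are invented for this illustration. The sketch does not reproduce the #NUM! errors, and the serial-number helper assumes the Windows 1900 date system and is only meant for dates on or after March 1, 1900.

```python
from datetime import date, timedelta

# Assumed offset for the 1900 date system: chosen so that
# January 1, 2000 maps to serial number 36526.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_date(year, month, day):
    """Mimic the rollover behavior of DATE(year, month, day)."""
    if 0 <= year <= 1899:
        year += 1900  # DATE(100,1,2) is treated as the year 2000
    # Let months past 12 spill into later years.
    months_total = year * 12 + (month - 1)
    real_year, month_index = divmod(months_total, 12)
    # Let days past the end of the month spill into later months.
    return date(real_year, month_index + 1, 1) + timedelta(days=day - 1)


def to_serial(d):
    """Serial number for a date on or after March 1, 1900."""
    return (d - EXCEL_EPOCH).days


print(excel_date(100, 1, 2))     # 2000-01-02
print(excel_date(1998, 14, 2))   # 1999-02-02
print(excel_date(1998, 1, 35))   # 1998-02-04

invoice = date(2011, 2, 10)
start_of_month = excel_date(invoice.year, invoice.month, 1)
end_of_month = excel_date(invoice.year, invoice.month + 1, 1) - timedelta(days=1)
print(start_of_month, end_of_month)    # 2011-02-01 2011-02-28
print(to_serial(date(2010, 6, 30)))    # 40359
```

The end-of-month line is the same trick as the last bullet above: move to the first day of the following month, and then step back one day.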
Figure 11.38 These six functions allow you to isolate any portion of a date or time.

The DATE function is amazing because it enables Excel to deal perfectly with invalid dates. If your calculations for month cause it to exceed 12, this is no problem. For example, if you ask Excel to calculate DATE(2010,16,45), Excel considers the 16th month of 2010 to be April 2011. To find the 45th day of April 2011, Excel moves ahead to May 15, 2011.

Figure 11.39 shows various results of the DATE and TIME functions.

Figure 11.39 The formulas in column D use DATE or TIME functions to calculate an Excel serial number from three arguments.

Using TIME to Calculate a Time

The TIME function is similar to the DATE function. It calculates a time serial number given a specific hour, minute, and second.

Syntax: TIME(hour,minute,second)

The TIME function returns the decimal number for a particular time. The decimal number returned by TIME is a value ranging from 0 to 0.99999999, representing the times from 0:00:00 (12:00:00 a.m.) to 23:59:59 (11:59:59 p.m.). This function takes the following arguments:
• hour: This is a number from 0 to 23, representing the hour.
• minute: This is a number from 0 to 59, representing the minute.
• second: This is a number from 0 to 59, representing the second.

As with the DATE function, Excel can handle situations in which the minute or second argument calculates to more than 60. For example, TIME(12,72,120) evaluates to 1:14 PM.

Additional examples of TIME are shown in the bottom half of Figure 11.39 in the preceding section.

Using DATEVALUE to Convert Text Dates to Real Dates

It is easy to end up with a worksheet full of text dates. Sometimes this is due to importing data from another system. Sometimes it is caused by someone not understanding how dates work.
If your dates are in many conceivable formats, you can use the DATEVALUE function to convert the text dates to serial numbers, which can then be formatted as dates.

Syntax: DATEVALUE(date_text)

The DATEVALUE function returns the serial number of the date represented by date_text. You use DATEVALUE to convert a date represented by text to a serial number. The argument date_text is text that represents a date in an Excel date format. For example, “1/30/1998” and “30-Jan-1998” are text strings within quotation marks that represent dates. Using the default date system in Excel for Windows, date_text must represent a date from January 1, 1900, to December 31, 9999. DATEVALUE returns a #VALUE! error if date_text is out of this range. If the year portion of date_text is omitted, DATEVALUE uses the current year from your computer’s built-in clock. Time information in date_text is ignored.

Any of the text values in column A of Figure 11.40 are successfully translated to a date serial number. In this instance, Excel should have been smart enough to automatically format the resulting cells as dates. By default, the cells are formatted as numeric. This leads many people to believe that DATEVALUE doesn’t work. You have to apply a date format to achieve the desired result.

Caution
The DATEVALUE function must be used with text dates. If you have a column of values in which some values are text and some are actual dates, using DATEVALUE on the actual dates will cause a #VALUE! error.

Figure 11.40 The formulas in column B use DATEVALUE to convert the text entries in column A to date serial numbers.

Using TIMEVALUE to Convert Text Times to Real Times

It is easy to end up with a column of text values that look like times. Similarly to DATEVALUE, you can use the TIMEVALUE function to convert these to real times.
Syntax: TIMEVALUE(time_text)

The TIMEVALUE function returns the decimal number of the time represented by a text string. The decimal number is a value ranging from 0 to 0.99999999, representing the times from 0:00:00 (12:00:00 a.m.) to 23:59:59 (11:59:59 p.m.). The argument time_text is a text string that represents a time in any one of the Microsoft Excel time formats. For example, “6:45 PM” and “18:45” are text strings within quotation marks that represent time. Date information in time_text is ignored.

The TIMEVALUE function is difficult to use because it is easy for a person to enter the wrong formats. In Figure 11.41, many people would interpret cell A8 as meaning 45 minutes and 30 seconds. Excel, however, treats this as 45 hours and 30 minutes. This misinterpretation makes TIMEVALUE almost useless for a column of cells that contain a text representation of minutes and seconds. The “Excel Troubleshooting” section later in this chapter discusses how to solve the problem of misinterpreting the TIMEVALUE function.

Frustratingly, Excel does not automatically format the results of this function as a time. Column B shows the result as Excel presents it. Column C shows the same result after a time format has been applied.

Caution
There are a few examples of text that DATEVALUE cannot recognize. One common example is when there is no space after the comma. For example, “January 21,2011” will return an error. To solve this particular problem, use Replace to change each comma to a comma followed by a space.

Caution
There are a few examples of text that TIMEVALUE cannot recognize. One common example is when there is no space before the AM or PM. For example, “11:00PM” will return an error. To solve this particular problem, use Replace to change “PM” to “ PM” and to change “AM” to “ AM”.

Figure 11.41 The formulas in column B use TIMEVALUE to convert the text entries in column A to times.
If there is no leading zero before entries with minutes and seconds, the formula produces an unexpected result.

Using WEEKDAY to Group Dates by Day of the Week

The WEEKDAY function would not be so intimidating if people could just agree how to number the days. This one function can give three different results.

Syntax: WEEKDAY(serial_number,return_type)

The WEEKDAY function returns the day of the week corresponding to a date. The day is given as an integer, ranging from 1 (Sunday) to 7 (Saturday), by default. This function takes the following arguments:
• serial_number: This is a sequential number that represents the date of the day you are trying to find. Dates may be entered as text strings within quotation marks (for example, “1/30/1998”, “1998/01/30”), as serial numbers (for example, 35825, which represents January 30, 1998), or as results of other formulas or functions (for example, DATEVALUE(“1/30/1998”)).
• return_type: This is a number that determines the type of return value:
  • If return_type is 1 or omitted, WEEKDAY works like the calendar on your wall. Typically, calendars are printed with Sunday on the left and Saturday on the right. The default version of WEEKDAY numbers these columns from 1 through 7.
  • If return_type is 2, you are using the biblical version of WEEKDAY. In the biblical version, Sunday is the seventh day. Working backward, Monday must occupy the 1 position.
  • If return_type is 3, you are using the accounting version of WEEKDAY. In this version, Monday is assigned a value of 0, followed by 1 for Tuesday, and so on. This version makes it very easy to group records by week. If cell A2 contains a date, then A2-WEEKDAY(A2,3) converts the date to the Monday that starts the week.

Figure 11.42 shows the results of WEEKDAY for all three return types.
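The three numbering schemes differ only by an offset, which the following Python sketch makes explicit. The function names are invented for this illustration. Python’s own date.weekday() happens to use the same Monday=0 numbering as return_type 3, so the other two are derived from it.

```python
from datetime import date, timedelta


def excel_weekday(d, return_type=1):
    """Day number as described for return_type 1, 2, or 3."""
    monday_zero = d.weekday()  # Python: Monday=0 ... Sunday=6
    if return_type == 1:
        return (monday_zero + 1) % 7 + 1  # Sunday=1 ... Saturday=7
    if return_type == 2:
        return monday_zero + 1            # Monday=1 ... Sunday=7
    if return_type == 3:
        return monday_zero                # Monday=0 ... Sunday=6
    raise ValueError("only return_type 1, 2, and 3 are sketched here")


def week_start_monday(d):
    """Same idea as A2-WEEKDAY(A2,3): back up to that week's Monday."""
    return d - timedelta(days=excel_weekday(d, 3))


sample = date(2011, 3, 5)  # a Saturday
print([excel_weekday(sample, t) for t in (1, 2, 3)])  # [7, 6, 5]
print(week_start_monday(sample))                      # 2011-02-28
```

Subtracting the return_type 3 value works for grouping because every date in a Monday-to-Sunday week collapses to the same Monday.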
Figure 11.42 Columns B, C, and D compare the WEEKDAY function for the three different return_type values shown in row 3.

Using WEEKNUM to Group Dates into Weeks

WEEKNUM is a disappointing function. It is disappointing because Microsoft does not perform the function correctly. Microsoft is probably keeping the calculation consistent with some earlier spreadsheets that started doing this incorrectly. However, it would be really easy for Microsoft to add a new pair of return_type arguments that would calculate WEEKNUM correctly.

Syntax: WEEKNUM(serial_num,return_type)

The WEEKNUM function returns a number that indicates where the week falls numerically within a year. This function takes the following arguments:
• serial_num: This is a date within the week.
• return_type: This is a number that determines on what day the week begins. The default is 1. If return_type is 1 or omitted, the week begins on Sunday. If return_type is 2, the week begins on Monday.

Figure 11.43 shows WEEKNUM for the first eight days of each year of the next eight years. Rows 11 through 18 show WEEKNUM with a return_type of 1, so the week starts on Sunday. Look at column C. The first day of the year is a Sunday. This works; cells C11:C17 report the first seven days as week 1, and cell C18 reports Sunday, January 8, 2012, as the first day of the week for week 2.

However, look at B11:B18. In this case, the year 2011 starts on a Saturday. The first day of the year is treated as week 1. Excel says that week 2 starts on January 2, 2011. It is horrible to have a one-day week starting your year. It guarantees that you will have a significant week 53 at the end of the year.

There is an ANSI standard for week numbering. This system says that your week 1 must have at least four days. In the ANSI system, Saturday, January 1, 2011, would be called week 0. In this system, whichever week contains January 4 is considered week 1.
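The difference between the two numbering schemes is easy to see side by side. The sketch below is Python, with excel_weeknum as an invented helper that follows the rule described above: week 1 is whatever partial week contains January 1. For comparison it uses Python’s date.isocalendar(), which implements ISO 8601 week numbering. ISO 8601 uses the same “week containing January 4” rule with Monday as the first day; treating it as the standard the text has in mind is an assumption. Note that ISO 8601 assigns the leftover days at the start of January to the last week of the previous year rather than to a week 0.

```python
from datetime import date


def excel_weeknum(d, return_type=1):
    """Week number where week 1 is the (possibly partial) week holding Jan 1."""
    jan1 = date(d.year, 1, 1)
    if return_type == 1:
        offset = (jan1.weekday() + 1) % 7  # days from Sunday to Jan 1
    elif return_type == 2:
        offset = jan1.weekday()            # days from Monday to Jan 1
    else:
        raise ValueError("only return_type 1 and 2 are sketched here")
    days_into_year = (d - jan1).days       # 0 for January 1
    return (days_into_year + offset) // 7 + 1


for d in (date(2011, 1, 1), date(2011, 1, 2), date(2012, 1, 7), date(2012, 1, 8)):
    iso_year, iso_week, _ = d.isocalendar()
    print(d, "WEEKNUM-style:", excel_weeknum(d), "ISO 8601:", (iso_year, iso_week))
```

For Saturday, January 1, 2011, the first helper returns 1 while ISO 8601 reports week 52 of 2010, which is the one-day-week problem described above.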
Alternative Calendar Systems and DAYS360

There are many alternative calendar systems that you might have to work with in Excel. Here are some examples:
• Manufacturers often redefine a quarter as being composed of 13 workweeks, with the first 4 weeks being called month 1, the next 4 weeks being month 2, and the final 5 weeks being month 3. This is known as a 4-4-5 calendar.
• Retailers use a special retail calendar composed of 52 7-day weeks. Each week ends on a Sunday. If you compare week 7, day 6 of one year to week 7, day 6 of another year, you are assured that you are comparing a Saturday to a Saturday and can have a like comparison.
• Some accounting systems use a 360-day calendar. In this type of system, the year is divided into 12 months of 30 days. There is special handling for months with 31 days. Unfortunately, U.S. and European accounting boards disagree on the special handling, so there are two sets of rules.

Out of these three alternative calendar systems, Excel handles only the 360-day calendar. Excel provides the DAYS360 function and the YEARFRAC function to deal with the date system.

Syntax: DAYS360(start_date,end_date,method)

The DAYS360 function returns the number of days between two dates, based on a 360-day year (12 30-day months), which is used in some accounting calculations. You use this function to help compute payments if your accounting system is based on 12 30-day months. This function takes the following arguments:
• start_date and end_date: These are the two dates between which you want to know the number of days. If start_date occurs after end_date, DAYS360 returns a negative number. Dates may be entered as text strings within quotation marks (for example, “1/30/1998”, “1998/01/30”), as serial numbers (for example, 35825, which represents January 30, 1998, if you’re using the 1900 date system), or as results of other formulas or functions (for example, DATEVALUE(“1/30/1998”)).
• method: This is a logical value that specifies whether to use the U.S. or European method in the calculation:
  • FALSE or omitted is the U.S. (National Association of Securities Dealers) method. If the starting date is the 31st of a month, it becomes equal to the 30th of the same month. If the ending date is the 31st of a month and the starting date is earlier than the 30th of a month, the ending date becomes equal to the 1st of the next month; otherwise, the ending date becomes equal to the 30th of the same month.
  • TRUE is the European method. Starting dates or ending dates that occur on the 31st of a month become equal to the 30th of the same month.

Figure 11.43 Excel calculates week numbers, but they are out of sync with the rest of the world.

Using YEARFRAC or DATEDIF to Calculate Elapsed Time

If you work in a human resources department, you might be concerned with years of service in order to calculate a certain benefit. Excel provides one function, YEARFRAC, that can calculate decimal years of service in five different ways. An old function, DATEDIF, has been hanging around since Lotus 1-2-3; it can calculate the difference between two dates in complete years, months, or days.

Syntax: YEARFRAC(start_date,end_date,basis)

The YEARFRAC function calculates the fraction of the year represented by the number of whole days between two dates (start_date and end_date). You use the YEARFRAC worksheet function to identify the proportion of a whole year’s benefits or obligations to assign to a specific term. This function takes the following arguments:
• start_date: This is a date that represents the start date.
Dates may be entered as text strings within quotation marks (for example, “1/30/1998”, “1998/01/30”), as serial numbers (for example, 35825, which represents January 30, 1998, if you’re using the 1900 date system), or as results of other formulas or functions (for example, DATEVALUE(“1/30/1998”)).
• end_date: This is a date that represents the end date.
• basis: This is the type of day count basis to use. Figure 11.44 compares the five types of basis available:
  • If basis is 0 or omitted, Excel uses a 30/360 plan, modified for American use. In this plan, the employee earns 1/360 of a year’s credit on most days. The employee earns no service on the day after any 31st of the month. In a leap year, the employee earns 2/360 of a year for showing up on March 1. In a nonleap year, the employee earns 3/360 of a year for showing up on March 1.
  • If basis is 1, the actual number of elapsed days is divided by the actual number of days in the year. This method works well and ensures that the year fraction ends up being 1 on the anniversary date, whether or not it is a leap year.
  • If basis is 2, the actual number of elapsed days is divided by 360. If someone would show up and work for 30 years straight for one employer, this method would give that person an extra 0.43 years of credit. Sosigenes would be spinning in his grave.
  • If basis is 3, the actual number of elapsed days is divided by 365. This works great for three out of every four years. It is slightly wrong in leap years.
  • If basis is 4, Excel uses a 30/360 plan, modified for European use. This is similar to the default basis of 0. In this plan, the employee gets no credit for working any 31st of the month. The employee still gets triple credit for working March 1 (to make up for the 29th and 30th of February). In a leap year, March 1 is worth only double credit.
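Three of the five bases are simple enough to sketch directly. The following Python illustration covers basis 2 (actual/360), basis 3 (actual/365), and basis 4 (European 30/360, using the rule that any 31st is treated as the 30th). The function names are invented for this example. Basis 0 and basis 1 have more intricate rules and are deliberately left out, so this is a partial sketch rather than a reimplementation of YEARFRAC.

```python
from datetime import date


def days_30_360_european(start, end):
    """European 30/360 day count: a 31st is treated as the 30th."""
    start_day = min(start.day, 30)
    end_day = min(end.day, 30)
    return ((end.year - start.year) * 360
            + (end.month - start.month) * 30
            + (end_day - start_day))


def yearfrac_sketch(start, end, basis):
    """Year fraction for basis 2, 3, or 4 only."""
    actual_days = (end - start).days
    if basis == 2:
        return actual_days / 360
    if basis == 3:
        return actual_days / 365
    if basis == 4:
        return days_30_360_european(start, end) / 360
    raise ValueError("only basis 2, 3, and 4 are sketched here")


# Thirty calendar years under basis 2 come out to more than 30.
extra = yearfrac_sketch(date(2000, 1, 1), date(2030, 1, 1), 2) - 30
print(round(extra, 4))

# Showing up on March 1 under basis 4: triple credit, or double in a leap year.
print(days_30_360_european(date(2011, 2, 28), date(2011, 3, 1)))  # 3
print(days_30_360_european(date(2012, 2, 29), date(2012, 3, 1)))  # 2
print(days_30_360_european(date(2011, 1, 30), date(2011, 1, 31)))  # 0
```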
Syntax: DATEDIF(start_date,end_date,unit)

In contrast to YEARFRAC, the DATEDIF function calculates complete years, months, or days. This function calculates the number of days, months, or years between two dates. It is provided for compatibility with Lotus 1-2-3. This function takes the following arguments:
• start_date: This is a date that represents the first, or starting, date of the period. Dates may be entered as text strings within quotation marks (for example, “2001/1/30”), as serial numbers, or as the results of other formulas or functions (for example, DATEVALUE(“2001/1/30”)).
• end_date: This is a date that represents the last, or ending, date of the period.
• unit: This is the type of information you want returned. The various values for unit are shown in Table 11.7.

Caution
DATEDIF has been in Excel forever, but it was only documented in Excel 2000. Why doesn’t Microsoft reveal DATEDIF in Help? Probably because of the strange anomaly when you try to calculate the gap from the 31st of January to the 1st of March in a nonleap year. The “D” version of DATEDIF reports this as 29 days. This is correct. The “M” version of DATEDIF reports this as one full month. This has to be correct, because the dates span the entire month of February. The “MD” version of DATEDIF reports this as a negative 2 days in excess of a full month. See cell D9 in Figure 11.45. This is simply the downside of trying to express a measurement in months, when the length of a month is not constant. Negative values for the MD version of DATEDIF will happen only when the end date is March 1 or March 2. Despite this problem, for 363 days a year, DATEDIF remains an effective way to express a date delta as a certain number of years, months, and days.

Table 11.7 Unit Values Used by the DATEDIF Function

Unit value: Description
Y: The number of complete years in the period. A complete year is earned on the anniversary date of the employee’s start date.
M: The number of complete months in the period. This number is incremented on the anniversary date. If the employee was hired on January 18, that person has earned one month of service on the 18th of February. If an employee is hired on January 31, then she earns credit for the month when she shows up for work on the 1st after any month with fewer than 31 days.
D: The number of days in the period. This could be figured out by simply subtracting the two dates.
MD: The number of days, ignoring months and years. You could use a combination of two DATEDIF functions (one using M and one using MD) to calculate days.
YM: The number of months, ignoring years. You could use a combination of two DATEDIF functions (one using Y and one using YM) to calculate months.
YD: The number of days, ignoring complete years.

Figure 11.44 compares the five types of basis of YEARFRAC with the six unit values of DATEDIF. Each cell uses A1 as the start date and that row’s column A as the end date.

Figure 11.44 If your benefits package includes information about complete months, then YEARFRAC with a basis value of 0 works best. Otherwise, a basis value of 1 is the most accurate.

Using EDATE to Calculate Loan or Investment Maturity Dates

If someone invests in a 6-month CD on the 17th of the month, the maturity date is on the 17th of another month. This would be a fairly straightforward calculation if no one invested on the 31st of a month.

The maturity rules work such that if you invest on the 31st of a month, and the CD would be scheduled to mature on the 31st of June, the CD maturity actually happens on the last day of June, which is June 30. If a CD is to mature on the 31st, 30th, or 29th day of February, the CD matures on the last day of February.
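The maturity rule described above amounts to “same day of the month, clamped to the last day when the target month is shorter.” Here is a minimal Python sketch of that rule. The function name edate_sketch is invented for this illustration, and the sketch assumes a whole number of months.

```python
import calendar
from datetime import date


def edate_sketch(start, months):
    """Move `months` whole months forward (or back), clamping the day."""
    months_total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(months_total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]  # days in target month
    return date(year, month, min(start.day, last_day))


print(edate_sketch(date(2011, 1, 17), 6))    # 2011-07-17
print(edate_sketch(date(2010, 12, 31), 6))   # 2011-06-30
print(edate_sketch(date(2010, 8, 31), 6))    # 2011-02-28
print(edate_sketch(date(2011, 6, 30), -6))   # 2010-12-30
```

The last call shows that a negative month count backs into an earlier date the same way.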
syntax: edate(start_date,months)

the edate function returns the serial number that represents the date that is the indicated number of months before or after a specified date (that is, start_date). you use edate to calculate maturity dates or due dates that fall on the same day of the month as the date of issue. this function takes the following arguments:
• start_date: this is a date that represents the start date. dates may be entered as text strings within quotation marks (for example, "1/30/1998", "1998/01/30"), as serial numbers (for example, 35825, which represents january 30, 1998, if you’re using the 1900 date system), or as results of other formulas or functions (for example, datevalue("1/30/1998")). if start_date is not valid, edate returns a #num! error.
• months: this is the number of months before or after start_date. a positive value for months yields a future date; a negative value yields a past date. if months is not an integer, it is truncated.

figure 11.45: in rare cases, datedif will report 1 month and -2 days.

figure 11.46 shows several examples of edate. note that in column b, the function is a no-brainer. you could easily calculate it by using the date function. the only interesting cases occur on the 29th, 30th, and 31st of the month. note that edate can be used to back into an investment date from a maturity date. for example, the records in rows 11 through 16 pass a negative number for the months parameter.

figure 11.46: you can use edate to calculate the maturity date for a security.

using eomonth to calculate the end of the month

before excel 2007, about 89 functions were available only in the analysis toolpak. some companies had rules that you were not allowed to build spreadsheets using the functions in the analysis toolpak. this rule was probably created by some corporate executive who didn’t know how to turn on the analysis toolpak!
one of my favorite puzzles at mrexcel.com came from someone who worked at such a company. how can you calculate the end of the month without using eomonth? this is a hard question; the end of the month is the 31st if the month number is 1, 3, 5, 7, 8, 10, or 12. it is the 30th if the month number is 4, 6, 9, or 11. if the month number is 2, then you have to look at the year to figure out if it is a leap year for 29 days or not a leap year for 28 days. the formula to solve this was horrible:

=date(year(a2),month(a2),choose(month(a2),31,28,31,30,31,30,31,31,30,31,30,31)+if(and(month(a2)=2,mod(year(a2),4)=0),1,0))

well-known excel guru aladin akyurek weighed in with the great answer and ended the entire discussion. aladin suggested using the date function to move up to the first of the next month and then simply subtract one day, using this formula:

=date(year(a2),month(a2)+1,1)-1

the sheer simplicity of this is beautiful. however, the whole question becomes immaterial now that eomonth has been promoted to be part of the actual excel function set.

syntax: eomonth(start_date,months)

the eomonth function returns the serial number for the last day of the month that is the indicated number of months before or after start_date. you use eomonth to calculate maturity dates or due dates that fall on the last day of the month. this function takes the following arguments:
• start_date: this is a date that represents the starting date. dates may be entered as text strings within quotation marks (for example, "1/30/1998", "1998/01/30"), as serial numbers, or as results of other formulas or functions (for example, datevalue("1/30/1998")). if start_date is not a valid date, eomonth returns a #num! error.
• months: this is the number of months before or after start_date. a positive value for months yields a future date; a negative value yields a past date. if months is not an integer, it is truncated.
if start_date plus months yields an invalid date, eomonth returns a #num! error. =eomonth(a2,0) converts any date to the end of the month.

using workday or networkdays to calculate workdays

if you work in a service industry, allow me to apologize to you on behalf of microsoft. if you work in retail, this section won’t work for you. if you work any schedule where you don’t have two consecutive days off, this won’t work. the workday functions work only for those people whose work environment is a traditional monday-through-friday week. the new international versions of the workday functions allow for a 2-day weekend to occur on any two consecutive days of the week.

if you happen to be in a monday-through-friday environment, the functions workday and networkdays are pretty cool. for example, they are great for calculating shipping days when you ship with fedex or ups. it takes a little work to get the holidays set up with these functions. here’s how you do it:
1. in an out-of-the-way section of a spreadsheet, enter any holidays that will fall during the workweek. this might be federal holidays, floating holidays, company holidays, and so on. the list of holidays can either be entered down a column or across a row. in the top portion of figure 11.47, the holidays are in e2:e7.
2. enter a starting date in a cell, such as b1.
3. in another cell, enter the number of workdays that the project is expected to take, such as b2.
4. enter the ending date formula as =workday(b1,b2,e2:e7).

the networkdays function takes two dates and figures out the number of workdays between them. for example, you might have a project that is due on june 17, 2011. if today is april 14, 2011, networkdays can calculate the number of workdays until the project is due.

caution: you have to format the result of the edate formula to be a date to see the expected results.
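as a rough check on what workday and networkdays compute, here is a minimal python sketch. the function names, the day-by-day loop, and the sample holiday are assumptions made for illustration; the sketch handles only a saturday-sunday weekend, a start date on or before the end date, and a positive number of days.

```python
from datetime import date, timedelta


def networkdays_sketch(start: date, end: date, holidays=()) -> int:
    """Count Monday-Friday dates from start through end, inclusive, skipping holidays."""
    skip = set(holidays)
    count = 0
    day = start
    while day <= end:
        # weekday() is 0 for Monday through 6 for Sunday.
        if day.weekday() < 5 and day not in skip:
            count += 1
        day += timedelta(days=1)
    return count


def workday_sketch(start: date, days: int, holidays=()) -> date:
    """Return the date that falls `days` workdays after start (start is not counted)."""
    skip = set(holidays)
    day = start
    remaining = days
    while remaining > 0:
        day += timedelta(days=1)
        if day.weekday() < 5 and day not in skip:
            remaining -= 1
    return day


# Hypothetical holiday list with one entry, used only for this illustration.
sample_holidays = [date(2011, 5, 30)]
print(networkdays_sketch(date(2011, 4, 14), date(2011, 6, 17)))
print(networkdays_sketch(date(2011, 4, 14), date(2011, 6, 17), sample_holidays))
print(workday_sketch(date(2011, 5, 27), 1, sample_holidays))
```

with the dates from the text, april 14, 2011 through june 17, 2011 spans 47 monday-through-friday days before any holidays are excluded, counting both the start and the end date.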
caution: you have to format the result of the eomonth formula to be a date to see the expected results.

syntax: workday(start_date,days,holidays)

syntax: networkdays(start_date,end_date,holidays)

the networkdays function returns the number of whole workdays between start_date and end_date. workdays exclude weekends and any dates identified in holidays. you use networkdays to calculate employee benefits that accrue based on the number of days worked during a specific term. this function takes the following arguments:
• start_date: this is a date that represents the start date. dates may be entered as text strings within quotation marks (for example, "1/30/1998", "1998/01/30"), as serial numbers, or as results of other formulas or functions (for example, datevalue("1/30/1998")).
• end_date: this is a date that represents the end date.
• holidays: this is an optional range of one or more dates to exclude from the working calendar, such as state and federal holidays and floating holidays. the list can be either a range of cells that contain the dates or an array constant of the serial numbers that represent the dates.

if any argument is not a valid date, networkdays returns a #num! error.

in figure 11.47, the current date is entered in cell b6. the project due date is entered in cell b7. the holidays range is in e2:e7, as in the previous example. the formula in cell b8 to calculate workdays is =networkdays(b6,b7,e2:e7).

figure 11.47: workday and networkdays can calculate the number of monday-through-friday days, exclusive of a range of holidays.

using international versions of workday or networkdays

two new functions in excel 2010 expand the workday and networkdays functions for countries where the weekend is not saturday and sunday.
the most common example is a weekend on friday and saturday, which has become popular in qatar, bahrain, kuwait, united arab emirates, and algeria. the international versions still require the weekend to be either two consecutive days or a single day.

syntax: workday.intl(start_date,days,weekend,holidays)

syntax: networkdays.intl(start_date,end_date,weekend,holidays)

both of these functions work as their noninternational equivalents, with the addition of having the weekend specified by the weekend argument. here are the values for the weekend argument:
1: weekend on saturday and sunday
2: weekend on sunday and monday
3: weekend on monday and tuesday
4: weekend on tuesday and wednesday
5: weekend on wednesday and thursday
6: weekend on thursday and friday
7: weekend on friday and saturday
11: sunday only
12: monday only
13: tuesday only
14: wednesday only
15: thursday only
16: friday only
17: saturday only

excel in practice: converting a holiday range to an array

the problem with putting the list of holidays in a range on a worksheet is that someone might accidentally overwrite or change the range of holidays. the syntax for the workday functions mentions that the holiday range can be converted to an array of serial numbers. to embed the holidays inside a function, you follow these steps:
1. in figure 11.47, select cell b3.
2. in the formula bar, use the mouse to select the characters e2:e7.
3. press the f9 key. excel will replace the selected characters with the calculated version of those characters. in this case, the calculation is the array {40544;40693;etc.}, as shown in figure 11.48.
4. press enter to accept the new formula.
5. you can now delete the holidays in column e.

figure 11.48: you can remove the holiday cells from the worksheet after embedding the array in the formula.

examples of text functions

when they think of excel, most people think of numbers.
excel is great at dealing with numbers, and it lets you write formulas to produce new numbers. excel also offers a whole cadre of functions for dealing with text. you might sometimes be frustrated because you receive data from other users, and the text is not in the format you need. or the mainframe might send customer names in uppercase, or the employee in the next department might put a whole address into a single cell. excel provides text functions to deal with all these situations and more.

joining text with the ampersand (&) operator

chapter 8 mentions the ampersand (&) operator, but it is worth mentioning again here because it is the most important tool for dealing with text. the & is an operator that you use to join text. suppose you have a worksheet with first name in column a and last name in column b, as shown in figure 11.49. you need to put these names together in a single cell. if you use the formula =a2&b2 in cell c2, excel smashes the names together (for example, stevenwoodward). instead, you have to join three elements. in between a2 and b2, you need to join a single space in double quotes. the formula to do this is =a2&" "&b2.

some people prefer to use the concatenate function instead of the &. this function does not perform the way that i want it to perform, and i generally avoid it, but it is described in the following section.

syntax: concatenate(text1,text2,...)

the concatenate function joins several text strings into one text string. the arguments text1, text2,... are 1 to 30 text items to be joined into a single text item. the text items can be text strings, numbers, or single-cell references.

the problem with this function is that it can accept only single-cell references. an attempt to use =concatenate(a2:b2) returns a #value! error. if you have to enter =concatenate(a2," ",b2), it is easier to use =a2&" "&b2. further, the function can handle only 30 references.
if you are joining cells with spaces in between, you will run out of terms after just 15 cells. with the &, you can join more than 30 items.

note: if you want to keep the data only in column c, you have to convert the formulas to values before deleting columns a and b. to do this, select the data in column c and then press ctrl+c to copy. then select home, paste, paste values to convert the formulas to values.

figure 11.49: the & character can be used to join text in cells or text enclosed in quotes.

using lower, upper, or proper to convert text case

three functions (lower, upper, and proper) convert text to or from capital letters. in figure 11.50, the products in column a were entered in a haphazard fashion. some products used lowercase, and some products used uppercase. column b uses =upper(a2) to make all the products a uniform uppercase.

in cell e13, text was entered by someone who never turns off caps lock. you can convert this uppercase to lowercase with =lower(e13).

in column e, you see a range of names in uppercase. you can use =proper(e2) to convert the name to proper case, which capitalizes just the first letter of each word. the proper function is mostly fantastic, but there are a few cells that you have to manually correct. proper does correctly capitalize names with apostrophes, such as O’Rasi in cell f3. it does not, however, correctly capitalize the interior c in McCartney in cell f4; the result is Mccartney. the function is also notorious for creating company names such as Ibm, 3m, and Aep.

syntax: lower(text)

the lower function converts all uppercase letters in a text string to lowercase. the argument text is the text you want to convert to lowercase. lower does not change characters in text that are not letters.

note: it would be great if microsoft would add a function to convert to sentence case.
we can hope to find such a function in future versions of excel.

syntax: proper(text)

the proper function capitalizes the first letter in a text string and any other letters in text that follow any character other than a letter. it converts all other letters to lowercase letters. the argument text is text enclosed in quotation marks, a formula that returns text, or a reference to a cell containing the text you want to partially capitalize.

syntax: upper(text)

the upper function converts text to uppercase. the argument text is the text you want converted to uppercase. text can be a reference or text string.

figure 11.50: upper, lower, and proper can convert text to and from capital letters.

using trim to remove trailing spaces

if you frequently import data, you might be plagued with a couple of annoying situations. this section and the next one deal with those situations.

you may have trailing spaces at the end of text cells. although “abc” and “abc ” (with a trailing space) might look alike when viewed in excel, they cause functions such as match and vlookup to fail. trim removes leading and trailing spaces.

in figure 11.51, you can see a simple vlookup in column b. the formula in cell b2 is =vlookup(a2,f2:g5,2,false). even though you can clearly see that m40498 is in the lookup table, vlookup returns an #n/a error, indicating that the product id is missing from the lookup table.

figure 11.51: this vlookup should work, but in this instance, it fails.

to diagnose and correct this problem, follow these steps:
1. select one of the data cells in column f. press the f2 key to put the cell in edit mode. a flashing insertion character appears at the end of the cell. check to see if the flashing cursor is immediately after the last character.
2. select one of the data cells in column a. press the f2 key to put the cell in edit mode.
note whether the flashing insertion character is immediately after the last character. figure 11.52 shows that the products in column a have several trailing spaces after them. the products in the lookup table do not have any trailing spaces.

figure 11.52: spaces are padding the right side of the products in column a.

3. if the problem is occurring in the values being looked up, you could modify the formula in cell b2 to use the trim function. the new formula would be =vlookup(trim(a2),f2:g5,2,false). figure 11.53 shows how this solves the problem.
4. if the problem is occurring in the first column of the lookup table, insert a new temporary column. enter the function =trim(f2) in the temporary column. copy this formula down to all rows of the lookup table. copy the new formulas and select home, paste, paste values to paste the new values. although the old and new values look the same, the trim function has removed the trailing spaces, and now the products match.

syntax: trim(text)

the trim function removes all spaces from text except for single spaces between words. you use trim on text that you have received from another application that may have irregular spacing. the argument text is the text from which you want spaces removed.

in figure 11.54, cell c1 contains six letters: abc def. you might assume that the cell is set to be centered. however, the formula in cell c2 appends an asterisk to each end of the value in cell c1. this formula shows that there are several leading and trailing spaces in the value. using =len(c1) shows that the text actually contains 15 characters instead of 6 characters.

the =trim(c1) formula removes any leading spaces, any trailing spaces, and any extra interior spaces. the function still leaves one space between abc and def because you want to continue to have words separated by a single space.
the formulas in cells c5 and c6 confirm that the leading and trailing spaces are removed and that the length of the new value is only seven characters.

figure 11.53: using trim to remove trailing spaces allows vlookup to work.

note: it is not necessarily efficient to calculate, but you can solve the trailing spaces in column f by using =vlookup(a2,trim(f2:g5),2,false) if you press ctrl+shift+enter to accept the formula.

using clean to remove nonprintable characters from text

although trim works great, the clean function no longer works as advertised. clean is designed to remove nonprintable characters from text. besides extra spaces, another annoying problem with data from other systems is that it may contain nonprintable characters. excel offers a function that is supposed to remove nonprintable characters, but microsoft’s definition of a nonprintable character is far too narrow. the function was clearly written before the proliferation of web queries, oracle, and sap.

syntax: clean(text)

the clean function removes all nonprintable characters from text. you use clean on text imported from other applications that contains characters that may not print with your operating system. for example, you can use clean to remove some low-level computer code that is frequently at the beginning and end of data files and cannot be printed. the argument text is any worksheet information from which you want to remove nonprintable characters.

figure 11.55 shows data in column a with characters that i routinely find in imported data. you might expect the clean function in column b to fix all these problems. if so, you will be highly disappointed. in the first pass, clean did not clean any of the bad characters. after i edited cell a2 to use a traditional nonprintable character, cell b2 was able to clean that one character.

to figure out exactly how clean works, you need the char function, which is conveniently scheduled to be discussed next.
read on for more of the clean saga.

figure 11.54: trim removes leading spaces and extra interior spaces.

figure 11.55: clean removes a short list of nonprintable characters. unfortunately, today’s data is littered with a new crop of nonprintable characters.

using the char function to generate any character

computers have the capability to display 255 different characters in any given font. for the past 20 years, this set of 255 characters has been known as the ascii (pronounced “ask-key”) character set. my u.s. keyboard gives me the capability to type 96 of those characters. the keyboard in another country may offer several more or fewer characters, but the point is that you cannot access all 255 characters by using your keyboard.

you might have ventured into start, all programs, accessories, system tools, character map to find a particular character in the wingdings character set. also, if you have a favorite symbol, you might have memorized that you can insert the symbol by using a hotkey. for example, if you hold down alt, type 0169 on the numeric keypad, and then release alt, an office program inserts the copyright symbol (©).

syntax: char(number)

the char function returns the character specified by a number. you use char to translate code page numbers you might get from files on other types of computers into characters. the argument number is a number between 1 and 255 that specifies which character you want. the character is from the character set used by your computer.

to figure out which characters were removed by the clean function (refer to the preceding section), i built a table with the numbers from 1 through 255. in figure 11.56, column a contains the character number. column b has the function =char(a2) to display that character in the times new roman font. column c has a formula to reveal whether clean removes that character: =if(len(clean(b2))=0,"yes","no").
after you copy these formulas to all 255 rows, you will learn that clean removes characters numbered 1 through 31, 129, 141, 144, and 157. to fit in one screen of cells, figure 11.56 shows all the codes arranged on one page.

if you see a strange character in your data, you can learn the character number by using the code function, as described in the following section.

using the code function to learn the character number for any character

each font set offers 255 different characters, numbered from 1 through 255. old-time computer folks might know some of the popular codes off the top of their heads. for example, a capital A is 65. the capital letters run from 65 to 90, a space is 32, a lowercase letter a is 97, and the other lowercase letters run from 98 through 122.

tip: although i know a few characters off the top of my head, i usually take a look at all characters in a set by entering =char(row()) in cells a1:a255. this returns character 65 in row 65, and so on.

in the early days of personal computers, every computer was packed with a list of the ascii codes for each character. today, with the character map, no one has to memorize character codes. however, in some instances, you might want to learn exactly what character you are seeing in a cell. the code function returns the character code for one character at a time.

syntax: code(text)

the code function returns a numeric code for the first character in a text string. the returned code corresponds to the character set used by your computer. the argument text is the text for which you want the code of the first character. this is an important distinction. code returns the code for only the first character in a cell. =code("A") and =code("ABC") both return only 65 to indicate the capital letter A.

a new problem began happening in excel in the past few years. people started encountering values with which trim would not remove the spaces from the text.
for example, figure 11.57 shows a value in column a that very clearly contains several spaces between the letters A and B. for details on the mid function used to isolate each character, see the “syntax: mid(text,start_num,num_chars)” section later in this chapter.

the =code(d2) formula in column e shows the character number for each character in the text. things start out well enough, with a character 65 being returned for the A. they also end up okay, with a character 66 being returned for the B at the end in row 9. however, all the middle characters are returning a character 160 instead of a typical space, which is character 32.

figure 11.56: this figure shows all char values in the times new roman data set. only the characters highlighted in black are removed by clean.

if you’ve ever created a small web page, you might have learned that browsers ignore consecutive spaces. if you really want to keep two words separated by four spaces, you need to use word1&nbsp;&nbsp;&nbsp;&nbsp;word2. i learned this trick somewhere on the web and never really thought about what &nbsp; means. it turns out that it is a nonbreaking space. and, you guessed it, a nonbreaking space occupies character position 160, so it looks just like a space. web designers use it all the time to format web pages. consequently, it is ending up in data that people paste into excel from the web, and it is making it appear that trim does not always work.

using left, mid, or right to split text

one of the newer rules in information processing is that each field in a database should contain exactly one piece of information. throughout the history of computers, there have been millions of examples of people trying to cram many pieces of information into a single field. although this works great for humans, it is pretty difficult to have excel sort a column by everything in the second half of a cell.

column a in figure 11.58 contains part numbers.
as you might guess, the part number field contains two pieces of information: a three-character vendor code and a five-digit part number, separated by a dash. when a customer comes in to buy a part, he probably doesn’t care about the vendor. so the real question is, “do you have anything in stock that can fix my problem?”

excel offers three functions (left, mid, and right) that allow you to isolate just the first or just the last characters, or even just the middle characters, from a column.

figure 11.57: code is instrumental in learning why the trim function won’t work on the data in column a.

note: excel pros know that they can remove the extra interior spaces by using =trim(a2). but if you look at the formula in cell b2 of figure 11.57, you see that trim is not removing the interior spaces. this requires some additional investigation, and code is the key to solving the problem. because code can work on only the first character in a cell, formulas in columns c and d isolate each character in the text.

syntax: left(text,num_chars)

the left function returns the first character or characters in a text string, based on the number of characters specified. this function takes the following arguments:
• text: this is the text string that contains the characters you want to extract.
• num_chars: this specifies the number of characters you want left to extract. num_chars must be greater than or equal to zero. if num_chars is greater than the length of text, left returns all of text. if num_chars is omitted, it is assumed to be 1.

syntax: right(text,num_chars)

the right function returns the last character or characters in a text string, based on the number of characters specified. this function takes the following arguments:
• text: this is the text string that contains the characters you want to extract.
• num_chars: this specifies the number of characters you want right to extract. num_chars must be greater than or equal to zero.
if num_chars is greater than the length of text, right returns all of text. if num_chars is omitted, it is assumed to be 1.

figure 11.58: left makes quick work of extracting the vendor code. several varieties of mid or right extract the part number.

syntax: mid(text,start_num,num_chars)

mid returns a specific number of characters from a text string, starting at the position specified, based on the number of characters specified. this function takes the following arguments:
• text: this is the text string that contains the characters you want to extract.
• start_num: this is the position of the first character you want to extract in text. the first character in text has start_num 1, and so on. if start_num is greater than the length of text, mid returns "" (that is, empty text). if start_num is less than the length of text, but start_num plus num_chars exceeds the length of text, mid returns the characters up to the end of text. if start_num is less than 1, mid returns a #value! error.
• num_chars: this specifies the number of characters you want mid to return from text. if num_chars is negative, mid returns a #value! error.

in figure 11.58, it is easy to extract the three-letter vendor code by using =left(a2,3). it is a bit more difficult to extract the part number. as you scan through the values in column a, it is clear that the vendor code is consistently three letters. with the dash in the fourth character of the text, it means that the part number starts in the fifth position. if you are using mid, you therefore use 5 as the start_num argument. however, there are a few thousand part numbers in the data set. right up front, in cell a4, is a part number that breaks the rule. luk-04-158 contains six characters after the dash. this might seem to be an isolated incident, but in row 10, bww-bc42tw also contains six characters after the dash.
because this type of thing happens in real life, two errors in the first nine records are enough to warrant a little extra attention.

the four possible strategies for extracting the part number are listed in g2:g6. they are as follows:
• ask mid to start at the fifth character and return a large enough number of characters to handle any possible length (that is, =mid(a2,5,100)).
• ask mid to start at the fifth character but use trim around the whole function to prevent any trailing spaces from being included (that is, =trim(mid(a2,5,100))).
• ask mid to start at the fifth character, but calculate the exact number of characters by using the len function (that is, =mid(a2,5,len(a2)-4)).
• skip mid altogether and ask right to return all the characters after the first dash. this requires you to use the find function to locate the first dash (that is, =right(a2,len(a2)-find("-",a2))).

using len to find the number of characters in a text cell

it seems pretty obscure, but you will find the len function amazingly useful. the len function determines the number of characters in a cell, including any leading or trailing spaces.

syntax: len(text)

the len function returns the number of characters in a text string. the argument text is the text whose length you want to find. spaces count as characters.

there are instances in which len can be used in conjunction with left, mid, or right to isolate a portion of text. to review information on this topic, refer back to the example in the previous section.

len can also be used to find records that are longer than a certain limit. suppose you are about to order nameplates for company employees. each nameplate can accommodate 15 characters. in figure 11.59, you add the len function next to the names and sort by the length, in descending order. any problem names appear at the top of the list.

figure 11.59: len identifies the number of characters in a cell.
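len drives two of the four part-number strategies from the previous section, so it can help to see all four side by side. the following python sketch is only an approximation of the excel functions: the helper names (mid, right, find, trim, part_number_strategies) are written for this illustration, and they cover only the ordinary cases used here, not every error condition excel reports.

```python
def mid(text: str, start_num: int, num_chars: int) -> str:
    """Approximate MID: positions are 1-based, as in Excel."""
    return text[start_num - 1:start_num - 1 + num_chars]


def right(text: str, num_chars: int) -> str:
    """Approximate RIGHT: return the last num_chars characters."""
    return text[max(0, len(text) - num_chars):]


def find(find_text: str, within_text: str, start_num: int = 1) -> int:
    """Approximate FIND: 1-based, case-sensitive; raise where Excel shows #VALUE!."""
    pos = within_text.find(find_text, start_num - 1)
    if pos < 0:
        raise ValueError("#VALUE!")
    return pos + 1


def trim(text: str) -> str:
    """Approximate TRIM: drop leading/trailing spaces, collapse interior runs."""
    return " ".join(piece for piece in text.split(" ") if piece)


def part_number_strategies(cell: str) -> dict:
    """Apply the four strategies from the text to one part-number cell."""
    return {
        "mid, generous length": mid(cell, 5, 100),
        "trim around mid": trim(mid(cell, 5, 100)),
        "mid with len": mid(cell, 5, len(cell) - 4),
        "right with find": right(cell, len(cell) - find("-", cell)),
    }


for label, value in part_number_strategies("luk-04-158").items():
    print(label, "->", repr(value))
```

all four agree on clean input. they differ only when the cell has trailing spaces, where the trim variant is the one that strips them.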
using search or find to locate characters in a particular cell

two nearly identical functions can scan through a text cell, looking for a particular character or word. many times, you just want to know if the word appears in the text. these functions go further than telling you if the character exists in the text; they tell you at exactly which character position the character or word is found. the character position can be useful in subsequent formulas with left, right, or replace.

first, let’s look at an example of using find to determine whether a word exists in another cell. figure 11.60 shows a database of customers. the database was created by someone who doesn’t know excel and jammed every field into a single cell.

here is how to make this work properly:
1. to find all the customers in california, in cell b2, enter =find(", CA",a2). when you enter the formula, you get a #value! error. this is okay. in fact, it is useful information: it tells you that CA is not found in the first record.
2. copy the formula down to all rows.
3. sort low to high by column b. 98% of the records have a #value! error and sort to the bottom of the list. the few california records have a valid result for the formula in column b and sort to the top of the list, as shown in figure 11.61.

find and search are similar to one another. the search function does not distinguish between uppercase and lowercase letters. search identifies CA, ca, Ca, and cA as matches for CA. if you need to find a cell with exactly AbCdEf, you need to use the find function instead of search. also, search allows for wildcard characters in find_text. a question mark (?) finds a single character, and an asterisk (*) finds any number of characters.

figure 11.60: when the manager asked an employee to type this in excel, she didn’t realize that the employee had never used excel before.
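the difference between find and search can be sketched in python. this is an approximation for illustration only: the function names, the choice to return None where excel shows a #value! error, and the sample address are all assumptions, and the tilde escape that search supports for literal wildcards is not handled.

```python
import re


def find_sketch(find_text: str, within_text: str, start_num: int = 1):
    """Case-sensitive, no wildcards; returns a 1-based position or None."""
    pos = within_text.find(find_text, start_num - 1)
    return pos + 1 if pos >= 0 else None


def search_sketch(find_text: str, within_text: str, start_num: int = 1):
    """Case-insensitive; ? matches one character and * matches any run of characters."""
    pattern = "".join(
        "." if ch == "?" else ".*" if ch == "*" else re.escape(ch)
        for ch in find_text
    )
    compiled = re.compile(pattern, re.IGNORECASE | re.DOTALL)
    match = compiled.search(within_text, start_num - 1)
    return match.start() + 1 if match else None


address = "San Jose, CA 95112"  # made-up value for this illustration
print(find_sketch("CA", address))      # exact case matches
print(find_sketch("ca", address))      # different case does not match
print(search_sketch("ca", address))    # case is ignored
print(search_sketch("j?se", address))  # ? stands for any single character
```

both functions report the position of the first character of the match, counted from 1, which is what the later left, mid, and right formulas rely on.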
figure 11.61: you don’t care where find found the text; you simply want to divide the list into records with valid values versus errors.

note: like all the other data sets in this book, these names and addresses are randomly generated from lists of the most popular first names, last names, street names, and city names. don’t try to send christmas cards to these people, because none of the addresses exist. and don’t think that the zip codes are real; everything here is completely random.

the find function makes it easy to find the first instance of a particular character in a cell. however, if your text values contain two instances of a character, your task is a bit more difficult. in figure 11.62, the part numbers in column a really contain three segments, each separated by a dash:
1. to find the first dash, enter =find("-",a2) in column b.
2. to find the second dash, use the optional start_num parameter to the find function. the start_num parameter is a character position. you want the function to start looking after the first instance of a dash. this can be calculated as the result of the first find in column b plus one. thus, the formula in cell c2 is =find("-",a2,b2+1).
3. after you find the character positions of the dashes, isolate the various portions of the part number. in column d, for the first part of the number, enter =left(a2,b2-1). this basically asks for the left characters from the part number, stopping one character before the first dash.
4. in column e, for the middle part of the number, enter =mid(a2,b2+1,c2-b2-1). this asks excel to start at the character position one after the first dash and then continue for a length that is one fewer than the position of the second dash minus the position of the first dash.
5. in column f, for the final part of the number, enter =right(a2,len(a2)-c2). this calculates the total length of the part number, subtracts the position of the second dash, and returns that many characters from the right.
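The five steps above can be traced with a short Python sketch. It is only an illustration: `split_part_number` is a hypothetical helper, the sample part number is invented, and the variables `b2` and `c2` simply mirror the helper columns b and c from the steps.

```python
def split_part_number(part):
    """Split a three-segment part number on its two dashes, step by step."""
    # Step 1: 1-based position of the first dash, like =find("-",a2)
    b2 = part.index("-") + 1
    # Step 2: 1-based position of the second dash, like =find("-",a2,b2+1).
    # A 1-based start of b2+1 is the 0-based index b2.
    c2 = part.index("-", b2) + 1
    # Step 3: =left(a2,b2-1)
    first = part[:b2 - 1]
    # Step 4: =mid(a2,b2+1,c2-b2-1)
    middle = part[b2:b2 + (c2 - b2 - 1)]
    # Step 5: =right(a2,len(a2)-c2)
    last = part[len(part) - (len(part) - c2):]
    return first, middle, last


# Hypothetical part number with three segments.
segments = split_part_number("AB-1234-XYZ")
```

Keeping the two dash positions in their own variables plays the same role as the helper columns in the worksheet: each later formula stays short because it reuses the positions instead of nesting find inside find.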
caution: the trick with this application of find is to look for something that is likely to be found only in california records. if you had customers in cairo, illinois, they would have also been found by the find command you just used. the theory with this sort of search is that you can quickly check through the few matching records to find false positives.

figure 11.62: formulaically isolating data between the first and second dashes can be done, but it helps to break each number down into small parts.

syntax: find(find_text,within_text,start_num)

find finds one text string (find_text) within another text string (within_text) and returns the number of the starting position of find_text from the first character of within_text. you can also use search to find one text string within another, but unlike search, find is case sensitive and doesn’t allow wildcard characters. the find function takes the following arguments:
• find_text: this is the text you want to find. if find_text is "" (that is, empty text), find matches the first character in the search string (that is, the character numbered start_num or 1). find_text cannot contain any wildcard characters.
• within_text: this is the text that contains the text you want to find.
• start_num: this specifies the character at which to start the search. the first character in within_text is character number 1. if you omit start_num, it is assumed to be 1.

syntax: search(find_text,within_text,start_num)

search returns the number of the character at which a specific character or text string is first found, beginning with start_num. you use search to determine the location of a character or text string within another text string so that you can use the mid or replace functions to change the text. the search function takes the following arguments:
• find_text: this is the text you want to find.
you can use the wildcard characters question mark (?) and asterisk (*) in find_text. a question mark matches any single character; an asterisk matches any sequence of characters. if you want to find an actual question mark or asterisk, you type a tilde (~) before the character. if you want to find a tilde, you type two tildes. if find_text is not found, a #value! error is returned.
• within_text: this is the text in which you want to search for find_text.
• start_num: this is the character number in within_text at which you want to start searching. if start_num is omitted, it is assumed to be 1. if start_num is not greater than zero or is greater than the length of within_text, a #value! error is returned.

caution: if find_text does not appear in within_text, find returns a #value! error. if start_num is not greater than zero, find returns a #value! error. if start_num is greater than the length of within_text, find returns a #value! error.

using substitute and replace to replace characters

when you have the ability to find text, you might want to replace text. excel offers two functions for this: substitute and replace. the substitute function is easier to use and should be your first approach.

syntax: substitute(text,old_text,new_text,instance_num)

the substitute function substitutes new_text for old_text in a text string. you use substitute when you want to replace specific text in a text string; you use replace when you want to replace any text that occurs in a specific location in a text string. the substitute function takes the following arguments:
• text: this is the text or the reference to a cell that contains text for which you want to substitute characters.
• old_text: this is the text you want to replace.
• new_text: this is the text you want to replace old_text with.
• instance_num: this specifies which occurrence of old_text you want to replace with new_text.
if you specify instance_num, only that instance of old_text is replaced. otherwise, every occurrence of old_text in text is changed to new_text.

for example, =substitute("sales data","sales","cost") would generate "cost data". the substitute function works similarly to a traditional find and replace command.

compared to the substitute function, the replace function is difficult enough to make even an old programmer’s head spin.

syntax: replace(old_text,start_num,num_chars,new_text)

replace replaces part of a text string, based on the number of characters specified, with a different text string. this function takes the following arguments:
• old_text: this is the text in which you want to replace some characters.
• start_num: this is the position of the character in old_text that you want to replace with new_text.
• num_chars: this is the number of characters in old_text that you want replace to replace with new_text.
• new_text: this is the text that will replace characters in old_text.

tip: in order to successfully use replace, you have to use functions to determine the location and number of characters to replace. in most circumstances, substitute is easier to use.

using rept to repeat text multiple times

the rept function will repeat a character or some text a certain number of times.

syntax: rept(text,number_times)

the rept function repeats text a given number of times. you use rept to fill a cell with a number of instances of a text string. this function takes the following arguments:
• text: this is the text you want to repeat.
• number_times: this is a positive number that specifies the number of times to repeat text. if number_times is 0, rept returns "" (that is, empty text). if number_times is not an integer, it is truncated. the result of the rept function cannot be longer than 32,767 characters.
in microsoft word, it is easy to create a row of periods between text and a page number. in excel, you have to resort to clever use of the rept function to do this.

in figure 11.63, column a contains a page number. column b contains a chapter title. the goal in column c is to join enough periods between columns b and a to make all the page numbers line up. the number of periods to print is the total desired length, less the length of columns a and b. the formula for cell c2 is =b2&rept(".",45-(len(a2)+len(b2)))&a2.

note: to make this work, you have to change the font in column c to be a fixed-width font such as courier new.

tip: an alternative solution is to format column a with the custom format of @*. (an at sign, an asterisk, and a period). this will show the text in the cell and follow it with a series of periods, enough to fill the current width of the column.

figure 11.63: the rept function can be used to calculate a certain number of repeated entries.

using exact to test case

for the most part, excel isn’t concerned about case. to excel, ABC and abc are the same thing. in figure 11.64, cells a1 and b1 contain the same letters, but the capitalization is different. the formula in cell c1 tests whether these values are equal. in the rules of excel, ABC and abc are equivalent. the formula in cell c1 indicates that the values are equal.

to some people, these two text cells may not be equivalent. if you work in a store that sells the big plastic letters that go on theater marquees, your order for 20 letter "A" figures should not be filled with an order for 20 letter "a" figures. excel forces you to use the exact function to compare these two cells to learn that they are not exactly the same.

figure 11.64: excel usually overlooks differences in capitalization when deciding whether two values are equal. you can use exact to find out whether they are equal and the same case.
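The two kinds of comparison can be contrasted in a few lines of Python. This is a rough sketch under assumptions: `equals_like_excel` and `exact` are hypothetical helpers, and lowercasing both strings is only an approximation of how the worksheet equal sign treats text.

```python
def equals_like_excel(text1, text2):
    """Approximate the worksheet = comparison for text: case is ignored."""
    return text1.lower() == text2.lower()


def exact(text1, text2):
    """Approximate EXACT: the strings must match character for character."""
    return text1 == text2


# The marquee-letter order: same letter, different case.
ordered, shipped = "A", "a"
plain_check = equals_like_excel(ordered, shipped)   # the order looks correct
strict_check = exact(ordered, shipped)              # the order is not correct
```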
syntax: exact(text1,text2)

the exact function compares two text strings and returns true if they are exactly the same and false otherwise. exact is case sensitive but ignores formatting differences. you use exact to test text being entered into a document. this function takes the following arguments:
• text1: this is the first text string.
• text2: this is the second text string.

using text, dollar, and fixed to format a number as text

excel is great at numbers. put a number in a cell, and you can format it in a variety of ways. however, when you join a cell containing text with a cell containing a number or a date, excel falls apart.

consider figure 11.65. cell a11 contains a date and is formatted as a date. when you join the name in cell b11 with the date in cell a11, excel automatically converts the date back to a numeric serial number. this is frustrating.

today, the text function is the most versatile solution to this problem. if you understand the basics of custom numeric formatting codes, you can easily use text to format a date or a number in any conceivable format. for example, the formula in cell c12 uses =text(a12,"m/d/y") to force the date to display as a date.

the text function gives you a lot of versatility. to learn the custom formatting codes for a cell, you can select the cell, display the format cells dialog (by pressing ctrl+1), and select the custom category on the number tab. excel shows you the codes used to create that format.

if you don’t care to learn the number formatting codes, you can use either the dollar or fixed function to return a number as text, with a few choices regarding number of decimals and whether excel should use the thousands separator. the formulas shown in c2:c7 in figure 11.65 return the formatted text values shown in column b.

syntax: text(value,format_text)

the text function converts a value to text in a specific number format.
formatting a cell with an option on the number tab of the format cells dialog changes only the format, not the value. using the text function converts a value to formatted text, and the result is no longer calculated as a number. the text function takes the following arguments:
• value: this is a numeric value, a formula that evaluates to a numeric value, or a reference to a cell that contains a numeric value.
• format_text: this is a number format in text form from the category box on the number tab in the format cells dialog box. format_text cannot contain an asterisk (*) and cannot be the general number format.

syntax: dollar(number,decimals)

the dollar function converts a number to text using currency format, with the decimals rounded to the specified place. the format used is $#,##0.00_);($#,##0.00).

the major difference between formatting a cell that contains a number with the format cells dialog and formatting a number directly with the dollar function is that dollar converts its result to text. a number formatted with the cells command is still a number. you can continue to use numbers formatted with dollar in formulas because microsoft excel converts numbers entered as text values to numbers when it calculates. the dollar function takes the following arguments:
• number: this is a number, a reference to a cell that contains a number, or a formula that evaluates to a number.
• decimals: this is the number of digits to the right of the decimal point. if decimals is negative, number is rounded to the left of the decimal point. if you omit decimals, it is assumed to be 2.

figure 11.65: text, dollar, and fixed can be used to format a number as text.

syntax: fixed(number,[decimals],[no_commas])

the fixed function rounds a number to the specified number of decimals, formats the number in decimal format using a period and commas, and returns the result as text.
the major difference between formatting a cell that contains a number with the format cells dialog and formatting a number directly with the fixed function is that fixed converts its result to text. a number formatted with the format cells dialog is still a number. this function takes the following arguments:
• number: this is the number you want to round and convert to text.
• decimals: this is the number of digits to the right of the decimal point. numbers in microsoft excel can never have more than 15 significant digits, but decimals can be as large as 127. if decimals is negative, number is rounded to the left of the decimal point. if you omit decimals, it is assumed to be 2.
• no_commas: this is a logical value that, if true, prevents fixed from including commas in the returned text. if no_commas is false or omitted, the returned text includes commas as usual.

using the t and value functions

the t and value functions are left over from lotus days. =t("text") returns the original text. if cell b1 contains the number 123, =t(b1) would return empty text. basically, t() returns the value in the cell only if it is text. value() converts text that looks like a number or a date to the number or the date.

using functions for non-english character sets

there are 11 more functions that have not been covered in this section. these functions deal with text in character systems where each character takes up more than 1 byte. this is true in many asian languages.

note: the following functions are beyond the scope of this edition: asc, bahttext, findb, jis, leftb, midb, phonetic, replaceb, rightb, searchb, yen. even so, these functions are described earlier in this chapter in table 11.3.

figure 11.66: you can use the left and right text functions to provide the arguments for the time function.
excel troubleshooting: text times entered as m:ss instead of h:mm:ss

in figure 11.66, column b contains a series of time trial results. when you total the column in cell b12, you realize that all the times were entered as text. the formulas in column c use =timevalue(b2). however, a time such as 2 minutes 50 seconds is converted in the function to 2 hours 50 minutes. in this case, timevalue does not work. there are two alternative strategies:
• one solution is to use the time function. in column d, the text times are converted to real times by using the time function. in each case, the hours should be zero. the minutes are left(b2,1). the seconds are right(b2,2). the formula in cell d2 is =time(0,left(b2,1),right(b2,2)). you copy this formula down and format the range as a time.
• the other solution is to use the concatenation operator to pad the left of column b with 0:0. this allows the text to work in the timevalue function. the formula in cell e2 is =timevalue("0:0"&b2). again, you need to copy this formula down and format the range as a time.

12 using powerful functions: logical, lookup, and database functions

this chapter covers four groups of workhorse functions. if you process spreadsheets of medium complexity, you will turn to logical and lookup functions regularly.
• the logical functions, including the ubiquitous if function, help make decisions.
• the information functions might be less important than they once were, now that microsoft has added the iferror function, but info, cell, and type still come in handy.
• the lookup functions include the powerful vlookup, match, and indirect functions. these functions are invaluable, particularly when you are doing something in excel when it would be better to use access. in addition, let’s face it; with 1.1 million rows in excel 2010, we will all do more things in excel that should be done in access.
• finally, the database functions provide the dsum functions.
even though these functions fell out of favor with the introduction of pivot tables, they are a powerful set of functions that are worthwhile to master.

table 12.1 provides an alphabetical list of all excel 2010’s logical functions. detailed examples of these functions are provided later in this chapter.

table 12.1 alphabetical list of logical functions
function: description
and(logical1,logical2,...): returns true if all its arguments are true; returns false if one or more arguments is false.
false(): returns the logical value false.
if(logical_test,value_if_true,value_if_false): returns one value if a condition specified evaluates to true and another value if it evaluates to false.
iferror(value,value_if_error): returns value_if_error if value is an error; otherwise, returns value itself.
not(logical): reverses the value of its argument. you use not when you want to make sure a value is not equal to another particular value.
or(logical1,logical2,...): returns true if any argument is true; returns false if all arguments are false.
true(): returns the logical value true.

table 12.2 provides an alphabetical list of all excel 2010’s information functions. detailed examples of these functions are provided in the remainder of the chapter.

table 12.2 alphabetical list of information functions
function: description
cell(info_type,reference): returns information about the formatting, location, or contents of the upper-left cell in a reference.
error.type(error_val): returns a number corresponding to one of the error values in microsoft excel or returns an #n/a error if no error exists. you can use error.type in an if function to test for an error value and return a text string, such as a message, instead of the error value.
info(type_text): returns information about the current operating environment.
isblank(value): returns true if value refers to an empty cell.
iserror(value): returns true if value refers to any error value (that is, #n/a, #value!, #ref!, #div/0!, #num!, #name?, or #null!).
iserr(value): returns true if value refers to any error value except #n/a.
iseven(number): returns true if number is even and false if number is odd.
islogical(value): returns true if value refers to a logical value.
isna(value): returns true if value refers to the #n/a (value not available) error value.
isnontext(value): returns true if value refers to any item that is not text. (note that this function returns true if value refers to a blank cell.)
isnumber(value): returns true if value refers to a number.
isodd(number): returns true if number is odd and false if number is even.
isref(value): returns true if value refers to a reference.
istext(value): returns true if value refers to text.
n(value): returns a value converted to a number.
na(): returns the error value #n/a, which means "no value is available." you use na to mark empty cells or cells that are missing information to avoid the problem of unintentionally including empty cells in your calculations. when a formula refers to a cell containing #n/a, the formula returns the #n/a error value.
type(value): returns the type of value. you use type when the behavior of another function depends on the type of value in a particular cell.

table 12.3 provides an alphabetical list of all excel 2010’s lookup functions. detailed examples of these functions are provided later in this chapter.

table 12.3 alphabetical list of lookup functions
function: description
address(row_num,column_num,abs_num,a1,sheet_text): creates a cell address as text, given specified row and column numbers.
areas(reference): returns the number of areas in a reference. an area is a range of contiguous cells or a single cell.
choose(index_num,value1,value2,...): uses index_num to return a value from the list of value arguments. you use choose to select one of up to 254 values, based on the index number. for example, if value1 through value7 are the days of the week, choose returns one of the days when a number between 1 and 7 is used as index_num.
column(reference): returns the column number of the given reference.
columns(array): returns the number of columns in an array or a reference.
getpivotdata(data_field,pivot_table,[field1],[item1],...): returns data stored in a pivot table report. you can use getpivotdata to retrieve summary data from a pivot table report, if the summary data is visible in the report.
hlookup(lookup_value,table_array,row_index_num,range_lookup): searches for a value in the top row of a table or an array of values and then returns a value in the same column from a row you specify in the table or array. you use hlookup when your comparison values are located in a row across the top of a table of data and you want to look down a specified number of rows. you use vlookup when your comparison values are located in a column to the left of the data you want to find.
hyperlink(link_location,friendly_name): creates a shortcut or jump that opens a document stored on a network server, an intranet, or the internet. when you click the cell that contains the hyperlink function, excel opens the file stored at link_location.
index(array,row_num,column_num): returns the value of a specified cell or array of cells within array.
index(reference,row_num,column_num,area_num): returns a reference to a specified cell or cells within reference.
indirect(ref_text,a1): returns the reference specified by a text string. references are evaluated immediately to display their contents. you use indirect when you want to change the reference to a cell within a formula without changing the formula itself.
lookup(lookup_value,lookup_vector,result_vector): returns a value from either a one-row or one-column range. this vector form of lookup looks in a one-row or one-column range, known as a vector, for a value and returns a value from the same position in a second one-row or one-column range. this function is included for compatibility with other worksheets. you should use vlookup instead.
lookup(lookup_value,array): returns a value from an array. the array form of lookup looks in the first row or column of an array for the specified value and returns a value from the same position in the last row or column of the array. this function is included for compatibility with other spreadsheet programs. you should use vlookup instead. however, unlike vlookup, the lookup function can process an array of lookup_values.
match(lookup_value,lookup_array,match_type): returns the relative position of an item in an array that matches a specified value in a specified order. you use match instead of one of the lookup functions when you need the position of an item in a range instead of the item itself.
offset(reference,rows,cols,height,width): returns a reference to a range that is a specified number of rows and columns away from a cell or range of cells. the reference that is returned can be a single cell or a range of cells. you can specify the number of rows and the number of columns to be returned.
row(reference): returns the row number of a reference.
rows(array): returns the number of rows in a reference or an array.
rtd(progid,server,topic,[topic2],...): retrieves real-time data from a program that supports com automation. this function was new in excel xp.
transpose(array): returns a vertical range of cells as a horizontal range or vice versa. transpose must be entered as an array formula in a range that has the same number of rows and columns, respectively, as array has columns and rows. you use transpose to shift the vertical and horizontal orientation of an array on a worksheet. for example, some functions, such as linest, return horizontal arrays. linest returns a horizontal array of the slope and y-intercept for a line. use transpose to convert the linest result to a vertical array.
vlookup(lookup_value,table_array,col_index_num,range_lookup): searches for a value in the leftmost column of a table and then returns a value in the same row from a column you specify in the table. you use vlookup instead of hlookup when your comparison values are located in a column to the left of the data you want to find.

table 12.4 provides an alphabetical list of all of excel 2010’s database functions. detailed examples of these functions are provided later in this chapter.

table 12.4 alphabetical list of database functions
function: description
daverage(database,field,criteria): averages the values in a column in a list or database that match the conditions specified.
dcount(database,field,criteria): counts the cells that contain numbers in a column in a list or database that match the conditions specified.
dcounta(database,field,criteria): counts all the nonblank cells in a column in a list or database that match the conditions specified.
dget(database,field,criteria): extracts a single value from a column in a list or database that matches the conditions specified.
dmax(database,field,criteria): returns the largest number in a column in a list or database that matches the conditions specified.
dmin(database,field,criteria): returns the smallest number in a column in a list or database that matches the conditions specified.
dproduct(database,field,criteria): multiplies the values in a column in a list or database that match the conditions specified.
dstdev(database,field,criteria): estimates the standard deviation of a population based on a sample, using the numbers in a column in a list or database that match the conditions specified.
dstdevp(database,field,criteria): calculates the standard deviation of a population based on the entire population, using the numbers in a column in a list or database that match the conditions specified.
dsum(database,field,criteria): adds the numbers in a column in a list or database that match the conditions specified.
dvar(database,field,criteria): estimates the variance of a population based on a sample, using the numbers in a column in a list or database that match the conditions specified.
dvarp(database,field,criteria): calculates the variance of a population based on the entire population, using the numbers in a column in a list or database that match the conditions specified.

table 12.5 provides an alphabetical list of all of excel 2010’s external functions. detailed examples of these functions are provided later in this chapter.

table 12.5 alphabetical list of external functions
function: description
call(register_id,argument1,...): calls a procedure in a dynamic link library (dll) or code resource. you use this syntax only with a previously registered code resource that uses arguments from the register function.
call(file_text,resource,type_text,argument1,...): calls a procedure in a dll or code resource. you use this syntax to simultaneously register and call a code resource for the macintosh.
call(module_text,procedure,type_text,argument1,...): calls a procedure in a dll or code resource. you use this syntax to simultaneously register and call a code resource for windows machines.
euroconvert(number,source,target,full_precision,triangulation_precision): converts a number to euros, converts a number from euros to a euro member currency, or converts a number from one euro member currency to another by using the euro as an intermediary (that is, triangulation). the currencies available for conversion are those of the european union (eu) members that have adopted the euro.
register.id(file_text,resource,type_text): returns the register id of the specified dll or code resource that has been previously registered. if the dll or code resource has not been registered, this function registers the dll or code resource and then returns the register id for the macintosh.
register.id(module_text,procedure,type_text): returns the register id of the specified dll or code resource that has been previously registered. if the dll or code resource has not been registered, this function registers the dll or code resource and then returns the register id for windows.
sql.request(connection_string,output_ref,driver_prompt,query_text,col_names_logical): connects with an external data source and runs a query from a worksheet. sql.request then returns the result as an array, without the need for macro programming. if this function is not already available, you should install the microsoft excel odbc add-in (xlodbc.xla).

examples of logical functions

with only seven functions, the logical function group is one of the smallest in excel. the if function is easy to understand, and enables you to solve a variety of problems.

using the if function to make a decision

many calculations in our lives are not straightforward. suppose that a manager offers a bonus program if her team meets its goals. or perhaps a commission plan offers a bonus if a certain profit goal is met.
these types of calculations can be solved by using the if function.

syntax: if(logical_test,value_if_true,value_if_false)

there are three arguments in the if function. the first argument is any logical test that results in a true or false. for example, you might have logical tests such as these:
a2>100
b5="west"
c99<=d99

all logical tests involve one of the comparison operators shown in table 12.6.

table 12.6 comparison operators
comparison operator    meaning                     example
=                      equal to                    c1=d1
>                      greater than                a1>b1
<                      less than                   a1<b1
>=                     greater than or equal to    a1>=0
<=                     less than or equal to       a1<=99
<>                     not equal to                a2<>b2

the remaining two arguments are the formula or value to use if the logical test is true and the formula or value to use if the logical test is false.

when you read an if function, you should think of the first comma as the word then and the second comma as the word otherwise. for example, =if(a2>10,25,0) would be read as "if a2>10, then 25; otherwise, 0."

figure 12.1 calculates a sales commission. the commission rate is 1.5 percent of revenue. however, if the gross profit percentage is 50 percent or higher, the commission rate is 2.5 percent of revenue.

in this case, the logical test is h2>=50%. the formula if that test is true is 0.025*f2. otherwise, the formula is 0.015*f2. you could build the formula as =if(h2>=50%,0.025*f2,0.015*f2).

note: mathematicians would correctly note that in both the second and third arguments of the formula =if(h2>=50%,0.025*f2,0.015*f2), you are multiplying by f2. therefore, you could simplify the formula by using =if(h2>=50%,0.025,0.015)*f2.

figure 12.1: in rows 2, 4, and 5 the commission is 1.5 percent. in rows 3 and 6 the commission is 2.5 percent.
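The commission rule reads the same way in any language with a conditional expression. Here is a small Python sketch of the rule as the text describes it; the function name `commission` and the sample revenue figures are hypothetical, and the gross profit percentage is passed as a fraction (0.50 for 50 percent).

```python
def commission(revenue, gross_profit_pct):
    """Pay 2.5% of revenue when gross profit is 50% or higher, else 1.5%.

    Mirrors the simplified form: pick the rate first, then multiply once.
    """
    rate = 0.025 if gross_profit_pct >= 0.50 else 0.015
    return rate * revenue


# Hypothetical orders: (revenue, gross profit percentage)
orders = [(10000, 0.42), (10000, 0.50), (8000, 0.61)]
payouts = [commission(revenue, pct) for revenue, pct in orders]
```

Choosing the rate before multiplying is the same simplification the note describes: the condition decides only the part that differs, and the shared multiplication by revenue is written once.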
Using the AND Function to Check for Two or More Conditions

The previous example had one simple condition: If the value in column H was greater than or equal to 50 percent, the commission rate changed. However, in many cases you might need to test for two or more conditions.

For example, suppose that a retail store manager offers a $25 bonus for every leather jacket sold on Fridays this month. In this case, the logical test requires you to determine whether both conditions are true. You can do this with the AND function.

Syntax: =AND(logical1,logical2,...)

The arguments logical1,logical2,... are from 1 to 255 expressions that evaluate to either TRUE or FALSE. The function returns TRUE only if all arguments are TRUE.

In Figure 12.2, the function in cell F2 checks whether cell E2 is a jacket and whether the date in cell D2 falls on a Friday:

=AND(E2="Jacket",WEEKDAY(D2,2)=5)

Figure 12.2 The AND function is TRUE only when every condition is met.

Using the AND Function to Compare Two Lists

The AND function can handle up to 255 expressions. Each expression can contain a range that might contain many instances of TRUE or FALSE.

A common issue is figuring out whether two worksheets are identical. In Figure 12.3, columns A:E contain the original worksheet. After this worksheet was passed among several co-workers, it ended back at your desk. Follow these steps to compare the two worksheets:

1. Leave three blank columns (columns F, G, and H) to the right of your original data.
2. Copy the data range of the returned worksheet. Paste this copy, starting in column I of the original worksheet.
3. Add the heading All Match? in column G.
4. Add a formula in column G to compare whether each of the cells in the original data set matches the cells in the returned data set. Add the formula =AND(A6=I6,B6=J6,C6=K6,D6=L6,E6=M6) in cell G6 to compare all five cells in the data set.
5.
Copy the formula down column G from cell G6 to match the number of rows in the data set.
6. In cell G2, enter an AND formula to test whether all the formulas in column G are TRUE. Even though this range contains more than 255 cells, it is still valid to use it as one of the expressions in the AND function. The formula in G2 is =AND(G6:G999). This is a quick way to find out whether every row is identical without having to scroll through pages of data, looking for a single FALSE result. If cell G2 returns TRUE, you know that the original and returned worksheets are identical. If cell G2 returns FALSE, you know that one or more of the rows were changed.
7. Select G6:G999. From the Home tab, select Find & Select, Find. The Find and Replace dialog appears.

Figure 12.3 AND can test whether a large range of logical tests are all TRUE.

8. In the Find and Replace dialog, type FALSE into the Find What box. You must click the Options button and change the Look In drop-down from Formulas to Values to find formulas that result in a value of FALSE.

Using OR to Check Whether Any Conditions Are Met

You might have a situation in which a certain formula is based on meeting one of several conditions. A sales manager may want to reward big orders and orders from new customers. The manager may offer a commission bonus if the order is over $50,000 or if the customer is a new customer this year.

To test whether a particular sale meets either condition, use the OR function. The OR function returns TRUE if any condition is true and returns FALSE if none of the conditions are true.

Syntax: =OR(logical1,logical2,...)

The OR function checks whether any of the arguments are TRUE. It returns FALSE only if all the arguments are FALSE. If any argument is TRUE, the function returns TRUE.

Tip: Instead of using the AND function, you can multiply the conditions.
=(A6=I6)*(B6=J6)*(C6=K6)*(D6=L6)*(E6=M6) returns 1 if all the conditions are true and zero if any one of the conditions is false. Alternatively, you can type =AND(A6:E6=I6:M6) and press Ctrl+Shift+Enter to have AND evaluate the array of comparisons.

The arguments logical1,logical2,... are 1 to 255 conditions that can evaluate to TRUE or FALSE.

In Figure 12.4, the logical test to see if revenue is over $50,000 is E2>50000. The logical test to see if the customer is new this year is D2=2010. The structure of this OR function is =OR(D2=2010,E2>50000).

You can use the OR function as the first argument to the IF function to produce the formula shown in cell F2: =IF(OR(D2=2010,E2>50000),0.025*E2,0.015*E2).

Figure 12.4 OR checks whether a record meets at least one of several criteria.

Nesting IF Functions

The IF function offers only two possible formulas. Either the logical test is true and the first formula is used, or the logical test is false and the second formula is used.

Many situations have a series of choices. For example, in a human resources department, annual merit raises may be given based on the employee's numeric rating in an annual review, in which employees are ranked on a 5-point scale. The rules for setting the raise are as follows:

• 4.5 or higher: 5 percent raise
• 4 or higher: 4.5 percent raise
• 3.25 or higher: 3 percent raise
• 2.5 or higher: 1 percent raise
• Under 2.5: No raise

You can build the IF statement by following these steps:

1. Test for the highest condition first. Excel stops testing when the first condition is met. If the first test checked whether an employee had a rating of 2.5 or higher, then anyone from 2.5 to 5 would receive a 1 percent raise. In this case, you want to give a 5 percent raise to anyone with a rating of 4.5 or greater. Therefore, the formula starts out as =IF(B2>=4.5,5%,.
2. There is only one argument left in the current IF function: the argument for value_if_false.
Instead of using a value as the third argument, start a second IF function to be used if the first test is false. This IF function starts out =IF(B2>=4,4.5%,. Combine this start of an IF function with the first IF function: =IF(B2>=4.5,5%,IF(B2>=4,4.5%,.
3. There are still three possible raise levels and only one argument left in the second IF function. Start a third IF function to be used as the value_if_false argument for the second IF function: =IF(B2>=3.25,3%,. At this point, if the employee did not rank at 3.25 or above, only two possibilities are left. The employee is either 2.5 and above for a 1 percent raise, or he or she gets no raise.
4. Create the fourth IF function: =IF(B2>=2.5,1%,0).
5. With the four IF functions, be careful to provide four closing parentheses at the end of the function: =IF(B2>=4.5,5%,IF(B2>=4,4.5%,IF(B2>=3.25,3%,IF(B2>=2.5,1%,0%)))) (see Figure 12.5).

Figure 12.5 This formula contains four nested IF functions.

Caution: These IF formulas are hard to read. There is a temptation to use them for situations with very long lists of conditions. Whereas Excel 2003 prevented you from nesting more than seven levels of IF functions, Excel 2007 and later allow you to nest up to 64 IF statements. Before you start nesting that many IF statements, you should consider using VLOOKUP, which is explained later in this chapter.

Using the TRUE and FALSE Functions

There are two remaining functions in the logical group, but you should not need to use either of them. If you encounter a formula with either the TRUE or FALSE function, you can replace the function with the value TRUE or FALSE. Microsoft added TRUE and FALSE to provide compatibility with other vendors' spreadsheet programs.

A formula such as =IF(OR(A2>5,B2>0),TRUE(),FALSE()) can be rewritten as =IF(OR(A2>5,B2>0),TRUE,FALSE). If you are trying to return TRUE or FALSE, you can simply use the Boolean expression: =OR(A2>5,B2>0).
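Looking back at the merit-raise schedule from the nested IF discussion, the following Python sketch shows the same highest-condition-first ordering. The function name merit_raise is a hypothetical choice for illustration, and the thresholds are assumed to be inclusive, matching the "or higher" wording of the rules.

```python
def merit_raise(rating: float) -> float:
    """Return the raise as a fraction for a review rating on a 5-point scale.

    Hypothetical helper: thresholds follow the published rules and are tested
    from highest to lowest, so the first match wins, as in a nested IF.
    """
    if rating >= 4.5:
        return 0.05
    elif rating >= 4:
        return 0.045
    elif rating >= 3.25:
        return 0.03
    elif rating >= 2.5:
        return 0.01
    else:
        return 0.0


if __name__ == "__main__":
    for rating in (4.8, 4.0, 3.5, 2.5, 1.9):
        print(rating, merit_raise(rating))
```

Reversing the order of the tests would give everyone rated 2.5 or higher the 1 percent raise, which is the mistake step 1 warns about.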
Using the NOT Function to Simplify the Use of AND and OR

In the language of Boolean logic, there are typically NAND, NOR, and XOR functions, which stand for Not And, Not Or, and Exclusive Or. To simplify matters, Excel offers the NOT function.

Syntax: =NOT(logical)

Quite simply, NOT reverses a logical value. TRUE becomes FALSE, and FALSE becomes TRUE when processed through a NOT function.

For example, suppose you need to find all flights landing outside of Oklahoma. You can build a massive OR statement to find every airport code in the United States. Alternatively, you can build an OR function to find Tulsa and Oklahoma City and then use a NOT function to reverse the result: =NOT(OR(A2="Tulsa",A2="Oklahoma City")).

Using the IFERROR Function to Simplify Error Checking

The IFERROR function, which was introduced in Excel 2007, was added at the request of many customers. To understand the IFERROR function, you need to understand how error checking was performed during the 22 years before Excel 2007 was released.

Figure 12.6 shows a typical spreadsheet that calculates a ratio of sales to hours. Even though this formula works most of the time, in occasional records the divisor is zero, and the formula returns a #DIV/0! error.

Figure 12.6 The zero in the divisor in row 5 causes a division-by-zero error.

The typical way to deal with this in legacy versions of Excel was to set up an IF function to check whether the divisor was zero: =IF(C5=0,0,B5/C5). If the divisor is zero, the formula returns a zero as the result. Otherwise, the formula performs the calculation.

In legacy versions of Excel, it was typical to use this type of IF formula on thousands of rows of data. The formula is more complex and takes longer to calculate than the new IFERROR function. However, this particular formula is tame compared to some of the formulas needed to check for errors.
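The two styles of error handling can be contrasted in a short Python sketch. This is only an analogy under stated assumptions: the function names are hypothetical, and Python's exception handling stands in loosely for IFERROR, which in Excel traps any error value rather than only division by zero.

```python
def ratio_checked_first(sales: float, hours: float) -> float:
    """Legacy style: test the divisor before dividing, like IF(C5=0,0,B5/C5)."""
    return 0 if hours == 0 else sales / hours


def ratio_with_fallback(sales: float, hours: float, fallback: float = 0) -> float:
    """IFERROR-like style: attempt the calculation once, substitute on failure."""
    try:
        return sales / hours
    except ZeroDivisionError:
        # Only division by zero is trapped here; this is narrower than IFERROR.
        return fallback


if __name__ == "__main__":
    print(ratio_checked_first(100, 0), ratio_with_fallback(100, 0))
    print(ratio_checked_first(100, 4), ratio_with_fallback(100, 4))
```

In both versions the calculation itself appears only once; the difference is whether the problem is anticipated with a test or handled after the attempt.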
A common error occurs when you use the VLOOKUP function to retrieve a value from a lookup table. In Figure 12.7, the VLOOKUP function in cell D2 asks Excel to look for the rep number S07 from cell B2 and find the corresponding name in the lookup table of F2:G9. This works great, returning Jesse from the table.

However, a problem arises when the sales rep is not found in the table. In row 7, rep S09 is new and has not yet been added to the table, so Excel returns the #N/A result.

If you wanted to avoid #N/A errors, the generally accepted workaround in legacy versions of Excel was to write this horrible formula:

=IF(ISNA(VLOOKUP(B7,$F$2:$G$9,2,FALSE)),"New Rep",VLOOKUP(B7,$F$2:$G$9,2,FALSE))

In English, this formula says to first find the rep name in the lookup table. If the rep is not found and the lookup returns the #N/A error, then use some other text, which in this case is the words New Rep. If the rep is found, then perform the lookup again and use that result.

Because VLOOKUP was one of the most time-intensive functions, it was horrible to have Excel perform every VLOOKUP twice in this formula. In a data set with 50,000 records, it could take minutes for the VLOOKUP to complete. Microsoft wisely added the new IFERROR function to handle all these error-checking situations.

Syntax: =IFERROR(value,value_if_error)

The advantage of the IFERROR function is that the calculation is evaluated only once. If the calculation results in any type of error value, such as #N/A, #VALUE!, #REF!, #DIV/0!, #NUM!, #NAME?, or #NULL!, Excel returns the alternate value. If the calculation results in any other valid value, whether it is numeric, logical, or text, Excel returns the calculated value.

The formula from the preceding section can be rewritten as =IFERROR(VLOOKUP(B7,$F$2:$G$9,2,FALSE),"New Rep") (see Figure 12.7).
This calculation is easier to write and calculates much more quickly than the method required in legacy versions of Excel.

Figure 12.7 An #N/A error means that the value is not in the lookup table.

Caution: If you will be sharing your workbook with people who use legacy versions of Excel, you should avoid using IFERROR. Instead, you should test for the various error conditions as described in the next section.

Examples of Information Functions

Found under the More Functions icon, the 17 information functions return eclectic information about any cell. Ten of the 17 functions are called the IS functions because they test for various conditions.

Using the IS Functions to Test for Errors

Figure 12.8 shows the results of the following four functions for testing error values:

• ISERROR: This function evaluates whether a calculation or value results in any type of error. If only people using Excel 2007 or later will use your workbooks, you should use the IFERROR function instead of ISERROR. However, if you need to share your workbook with people using legacy versions of Excel, you should use ISERROR, which is usually combined with an IF function. Here is an example: =IF(ISERROR(A2),"Unknown",A2).
• ISERR: This function is similar to ISERROR, except it does not report #N/A errors.
• ISNA: This function specifically tests whether a result returns an #N/A error.
• ERROR.TYPE: This function lets you know specifically what error is being returned. This function returns a value from 1 through 7 to indicate #NULL!, #DIV/0!, #VALUE!, #REF!, #NAME?, #NUM!, and #N/A, respectively.
It is possible to write a lengthy formula such as the following to decode these values and provide a friendlier error message:

=IF(NOT(ISERROR(A2)),A2,CHOOSE(ERROR.TYPE(A2),"Null value found","Division by zero","Invalid value","Missing reference","Undefined name","Numeric error","Value not available"))

Note: Mathematicians in the audience may suggest that you could just as easily use MOD(A2,2)=0 to figure out whether a number is even. However, unless you are a mathematician, it is far easier to remember ISEVEN().

Figure 12.8 The results of IS functions for detecting errors.

Using IS Functions to Test for Types of Values

Figure 12.9 shows the results for the seven remaining IS functions. Each of these functions reveals whether a value contains a particular type of value:

• ISBLANK: This function returns TRUE only if a cell is completely empty. A cell that contains several spaces is not considered blank. Even a cell that contains a single apostrophe and no spaces is not considered blank by the ISBLANK function. It would have been more appropriate if the folks at Lotus 1-2-3 had called this the @ISEMPTY function, but you are stuck with the bad name now that it has been in use forever.
• ISEVEN: This function indicates whether a number is evenly divisible by 2. Note that cell C8 refers to an empty cell, which is considered zero and reports as even. Using a date as the value in ISEVEN returns a value, but that value does not make sense. Using text or logical values in the ISEVEN function causes a #VALUE! error.
• ISODD: This function indicates whether a number is not evenly divisible by 2. An empty cell is considered zero and returns FALSE to ISODD. The same limitations listed for ISEVEN apply to ISODD. In addition, if your value contains decimal places, they are ignored by both the ISEVEN and ISODD functions. Numbers such as 1.02, 1.2, 1.5, 1.9, and 1.99999999 all return TRUE for the ISODD function.
• ISLOGICAL: This function indicates whether the value is either TRUE, FALSE, or an expression that results in TRUE or FALSE.
• ISTEXT: This function returns TRUE if the value contains text. This is good for finding values such as ABC in cell A16 and for finding cells that look like numbers but are actually stored as text.
• ISNONTEXT: This function returns TRUE for anything that is nontext. Numbers, logicals, dates, empty cells, and even error cells return TRUE for ISNONTEXT.
• ISNUMBER: This function returns TRUE for numeric cells and dates. Note that although the empty cell A8 can be calculated as even in cell C8, it returns FALSE to ISNUMBER in cell H8.

Figure 12.9 The results of IS functions for detecting certain types of values.

Note: A very important distinction here: ISLOGICAL does not tell you whether a value is FALSE. It merely indicates that the expression results in one of the valid logical values of TRUE or FALSE.

The functions in this section are nearly always used in conjunction with an IF function. For example, ZIP codes in the United States should always be five digits. This causes problems when someone keys in a ZIP code for certain eastern cities that start with a zero. For example, in cell C6 of Figure 12.10, the proper way to key a ZIP code for Portland, Maine, is to type an apostrophe and then 04123. Most people forget the apostrophe, and Excel drops the leading zero, as shown in cell C5.

Figure 12.10 The formula in column D detects nontext ZIP codes and converts them to text with five digits.

The formula in column D, =IF(ISNONTEXT(C5),RIGHT("0000"&C5,5),C5), fixes errant ZIP codes in column C. If the value in column C is nontext, the formula pads the left side of the ZIP code with zeros and then takes the five rightmost digits.

Another use of the IS functions is in the formulas for a conditional formatting rule. In Figure 12.11, a few cells were entered erroneously as text instead of numbers.
Setting up a rule to mark any cells where the formula =ISTEXT(B2) is TRUE reveals the cells that need to be updated.

Figure 12.11 An ISTEXT function is used in conditional formatting to mark any numbers erroneously entered as text.

For more information on using formulas as rules for conditional formatting, see Chapter 9, "Controlling Formulas."

Using the ISREF Function

The ISREF function tests whether a value is a reference.

Syntax: =ISREF(value)

ISREF returns TRUE if the value is a valid reference. Initially, this function may seem to be useless. After all, you inherently know that A2 is a valid reference, so you would not have to use a function to test it. The following formulas return TRUE: =ISREF(A2), =ISREF(XFD1048576), and =ISREF(A2:Z99). The following formulas return FALSE: =ISREF("A2"), =ISREF(99), and =ISREF(22).

ISREF is useful in one special circumstance. For example, suppose you have designed a spreadsheet with the named range ExpenseTotal. If you are worried that someone might have deleted this particular row, you can check whether ExpenseTotal is still a valid name by using ISREF(ExpenseTotal). Here's an example:

=IF(ISREF(ExpenseTotal),ExpenseTotal*2,"Named range has been deleted")

Using the ISREF Function to Check a Reference

The lookup function INDIRECT allows you to build a cell reference by using a formula. In Figure 12.12, the cell address in cell D14 is built using a formula to concatenate a column letter with a row number. Cell D15 then uses the INDIRECT function to return the value stored in the cell referenced by the formula in cell D14.

As you can imagine, this process is subject to error. Someone might enter a negative number, as shown in cell D18. Before using the INDIRECT function, you can check whether the reference in cell D14 is a valid reference by using =ISREF(INDIRECT(D14)).
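Returning to the ZIP code repair formula from Figure 12.10, the pad-then-take-rightmost idea behind RIGHT("0000"&C5,5) can be sketched in Python. The function name fix_zip is hypothetical, and the sketch assumes nontext ZIP codes arrive as whole numbers (Python int values), which is a simplification of how Excel stores numeric cells.

```python
def fix_zip(value):
    """Return a five-character ZIP code string.

    Hypothetical helper: text values pass through unchanged, as in the
    value_if_false branch of the worksheet formula; integers are padded
    on the left with zeros and trimmed to the five rightmost characters.
    """
    if isinstance(value, str):
        # Already text, so the leading zero (if any) was preserved.
        return value
    padded = "0000" + str(value)   # like "0000"&C5
    return padded[-5:]             # like RIGHT(...,5)


if __name__ == "__main__":
    print(fix_zip(4123))     # a Portland, Maine ZIP keyed without the apostrophe
    print(fix_zip("04123"))  # keyed correctly as text
    print(fix_zip(46240))
```

Four zeros are enough padding because even a one-digit number then yields five characters.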
Using the N Function to Add a Comment to a Formula

You can call Excel's N function a creative use for an obsolete function. Lotus 1-2-3 used to offer an N() function that converted a value as follows:

• N(any number) returned that number.
• N(a date) returned the serial number of the date.
• N(TRUE) returned 1.
• N(FALSE) returned 0.
• N(any error) returned the error.
• N(any text) returned 0.

None of these conversions is terribly interesting. You can replicate just about any of them by referring to the value and changing the cell format.

An interesting unintended use of the function is that N(any text) always returns zero. A useful trick is to insert a comment about a long formula by adding the N function to the end of the formula. However, make sure that your comment contains text. Because N(text) is zero, the outcome of the formula does not change. When you come back to the formula several months later, you can see the comment in the formula bar (see Figure 12.13).

Figure 12.12 Prevent problems with INDIRECT by checking ISREF(INDIRECT()) first.

Figure 12.13 Because N of text is zero, you can store a comment in the N function.

Using the NA Function to Force Charts to Not Plot Missing Data

Suppose that you are in charge of a school's annual fund drive. Each day, you mark the fundraising total on a worksheet by following these steps:

1. In column A, you enter the results of each day's collection through 9 days of the fund drive (see Figure 12.14).

Figure 12.14 Using NA in the chart on the right allows the trendline to ignore future missing data points and project a reasonable ending result.

2. Enter a formula in column C to keep track of the total collected throughout the fund drive.
3. To avoid making it look like the fund drive collected nothing in days 10 through 14, enter a formula in column C to check whether column A is blank.
If it is, the IF function inserts a null cell in column C. For example, the formula in cell C15 is =IF(ISBLANK(A15),"",A15+C14).
4. Build a line chart based on B1:C15. Add a trendline to the chart to predict future fundraising totals.
5. As shown in columns A:C of Figure 12.14, this technique fails. Even though the totals for days 10 through 14 are blank, Excel charts those days as zero. The linear trendline predicts that your fundraising will go down, with a projected total of just over $2,000.
6. Try the same chart again, but this time use the NA function instead of "" in the IF statement in step 3. The formula is shown in cell H16, and the results are in cell J15. Excel understands that NA values should not be plotted. The trendline is calculated based on only the data points available and projects a total just under $18,000.

In many cases, you are trying to avoid #N/A errors. However, in the case of charting a calculated column, you might want #N/A values to produce the correct look in the chart.

Using the INFO Function to Print Information About a Computer

The remaining information functions tell you some piece of information about a particular cell or about the computer. The INFO function is left over from Lotus 1-2-3. Some of the information it provides was useful only in Lotus. However, a few of the options may be useful to display in an Excel spreadsheet.

Syntax: =INFO(type_text)

The INFO function returns information about the current operating environment. The following are valid values for the type_text argument:

• Directory: Returns the folder where the current workbook is saved. If the file is not yet saved, returns #N/A.
• Numfile: Returns the number of open files. This is not just open workbooks, but all files open on the system.
• Memavail: Returns the available memory. This appears to be some old DOS version of the memory available.
Even on a system with 128MB of RAM, the total memory reported is about 4MB, so it might be the memory assigned to the partition running Excel.
• Memused: Specifies the memory in use by Excel.
• Totmem: Returns the total of the previous two results.
• Origin: Returns the text "$A:" and the absolute cell address of the upper-left cell visible in the current window. The "$A:" prefix is a notation used by Lotus 1-2-3 release 3.0. You might think there could be uses for this result. For example, =INDIRECT(TRIM(MID(INFO("origin"),2,50))) returns the value shown in the upper-left corner of the visible window. However, note that beginning with Excel 2007, you can use the scrollbars to change the upper-left cell, and Excel does not recalculate, leaving the origin result incorrect until you change a cell in Excel.
• Osversion: Returns the version number of your operating system.
• Recalc: Returns either Manual or Automatic to indicate the current recalculation status. You might provide a hint to the spreadsheet reader with =IF(INFO("recalc")="Manual","Press F9 to calculate","").
• Release: Specifies the release number of Excel. For Excel 2007, this is 12.0. You might be able to use this information in combination with IF and INDIRECT to correctly build a reference to the entire worksheet.
• System: Returns either mac or pcdos to indicate Macintosh or Windows.

Figure 12.15 shows the results of several variations of the INFO function.

Figure 12.15 A few of the argument values for INFO() still return useful results.

Using the CELL Function

The CELL function can tell you specific information about a specific cell, or it can tell you specific information about the last cell changed in the worksheet.

Again, some of the types of information are a bit dated. For example, the color argument was written in the day when a cell was either black or possibly red if the value was negative.
The prefix argument is based on when cells could be left-aligned, centered, or right-aligned. Even though Excel has offered several levels of indenting for a decade, the prefix version of the CELL function does not reveal anything about the indentation level.

Syntax: =CELL(info_type,reference)

To use the CELL function, you specify the type of information and optionally a cell reference. If you specify a cell reference, Excel provides information about the cell in the reference. If you leave off the reference, Excel returns information about the last cell changed in the workbook.

The argument info_type is a text value that specifies what type of cell information you want. The following are the possible values of info_type and the corresponding results:

• Contents: Returns the value in the upper-left cell in reference.
• Address: Returns the address of the first cell in reference, as text. As shown in cell B5 of Figure 12.16, this is always returned in absolute reference style.

Figure 12.16 The CELL function returns information about a specific cell, in this case, cell A1.

• Row: Returns the row number of the cell in reference.
• Col: Returns the column number of the cell in reference.
• Filename: Returns the filename as text, including the full path of the file that contains reference. If the worksheet that contains reference has not yet been saved, empty text ("") is returned. Interestingly, this argument now also returns the worksheet name if the workbook contains multiple worksheets.
• Format: Returns the text value corresponding to the number format of the cell. Returns - at the end of the text value if the cell is formatted in color for negative values. If the cell is formatted with parentheses for positive or all values, () is returned at the end of the text value. The values reported as a format reflect old Lotus 1-2-3 codes.
When you ask for the format, Excel attempts to convert the current numeric format to an old-style Lotus 1-2-3 formatting code. Table 12.7 shows some examples.
• Parentheses: Returns 1 if the cell is formatted with parentheses for positive or all values; otherwise, returns 0.

Table 12.7 Custom Codes in Excel and Lotus 1-2-3
Excel Format            Excel Custom Code    Lotus Format Code
General                 General              G
Numeric, no decimal     0                    F0
Numeric, 2 decimals     0.00                 F2
Comma, 2 decimals       #,##0.00             ,2
Currency, 2 decimals    $#,##0.00_)          C2
Percent, 1 decimal      0.0%                 P1
Scientific notation     0.00E+00             S2
Fractions               # ?/?                G
Date                    m/d/yy               D4
Date                    d-mmm-yy             D1
Date                    d-mmm                D2
Date                    mmm-yy               D3
Time                    h:mm AM/PM           D7

• Color: Returns 1 if the cell is formatted in color for negative values; otherwise, returns 0.
• Prefix: Returns the text value corresponding to the "label prefix" of the cell as follows:
  • Returns a single quotation mark (') if the cell contains left-aligned text.
  • Returns a double quotation mark (") if the cell contains right-aligned text.
  • Returns a caret (^) if the cell contains centered text.
  • Returns a backslash (\) if the cell contains fill-aligned text.
  • Returns empty text ("") if the cell contains anything else.
• Protect: Returns 0 if the cell is not locked and 1 if the cell is locked. Remember that by default, all Excel cells start with their locked property set to TRUE. The locked property is taken into account only if protection is enabled. This argument for the CELL function reports a 1 even if protection is not turned on.
• Type: Returns the text value corresponding to the type of data in the cell as follows:
  • Returns b for blank if the cell is empty.
  • Returns l for label if the cell contains a text constant.

Caution (for the width argument): Be careful with this: It is now possible to change column widths without causing Excel to calculate.
You might have to press F9 to have the result of this formula change.

  • Returns v for value if the cell contains anything else.
• Width: Returns the column width of the cell, rounded to an integer. Each unit of column width is equal to the width of one character in the default font size.
• Reference: This is an optional cell reference. If reference is omitted, CELL returns the information about the last changed cell.

Refer back to Figure 12.16, which shows every CELL option for a specific cell: cell A1. For additional examples, see Excel Help for the CELL function.

Using CELL to Track the Last Cell Changed

If you leave off the second argument of the CELL function, Excel returns the information about the last cell changed in the workbook. Follow these steps to create an interesting watch window of the last cells changed:

1. In an out-of-the-way spot, enter the formula =CELL("address").
2. Just below this formula, enter the formula =CELL("contents").
3. Just below that formula, enter the formula =CELL("filename").
4. Select all three of these cells.
5. From the Formulas tab, select the Watch Window icon. The Watch Window dialog appears.
6. Click the Add Watch button in the Watch Window dialog.
7. Because the default column widths are initially not wide enough to show the complete value, drag the vertical bars between the headings in the Watch Window so that you can see the complete Value and Formula columns. The other columns for Book, Sheet, and Cell can be made smaller.

The result, as shown in Figure 12.17, is a floating window that always reveals the last changed cell address and contents.

Figure 12.17 The Watch Window always shows the last cell changed and the value of that cell.
Note that, as in this case, the last changed cell might be on another worksheet.

Using TYPE to Determine Type of Cell Value

The final information function is the TYPE function. You use TYPE(value) to determine whether a value is a number, text, logical, an error value, or an array. Note that dates are treated as numbers.

Syntax: =TYPE(value)

The TYPE function returns a numeric code that tells you about the type of value. The TYPE function returns the following values:

• 1: For a numeric or date type
• 2: For a text type
• 4: For a logical type
• 16: For an error type
• 64: For an array type

Figure 12.18 shows the results for various values in the TYPE function.

Figure 12.18 The TYPE function returns what type of value is specified as an argument.

Examples of Lookup and Reference Functions

The Lookup & Reference icon contains 18 functions. The all-star of this group is the venerable VLOOKUP function, which is one of the most powerful and most used functions in Excel. As database people point out, a lot of work done in Excel should probably be done in Access. The VLOOKUP function allows you to perform the equivalent of a join operation in a database.

This lookup and reference group also includes several functions that seem useless when considered alone. However, when combined, they allow for some very powerful manipulations of data. The examples in the following sections reveal details on how to use the lookup functions and how to combine them to create powerful results.

Using the CHOOSE Function for Simple Lookups

Most lookup functions require you to set up a lookup table in a range on the worksheet. However, the CHOOSE function allows you to specify up to 254 choices right in the syntax of the function. The formula that requires the lookup should be able to calculate an integer from 1 to 254 in order to use the CHOOSE function.
syntax: choose( index_num,value1,value2,... )
the choose function chooses a value from a list of values, based on an index number. the choose function takes the following arguments:
• index_num—this specifies which value argument is selected. index_num must be a number between 1 and 254 or a formula or reference to a cell containing a number between 1 and 254:
  • if index_num is 1, choose returns value1; if it is 2, choose returns value2; and so on.
  • if index_num is a decimal, it is truncated to the next lowest integer before being used.
  • if index_num is less than 1 or greater than the number of the last value in the list, choose returns a #value! error.
• value1,value2,...—these are 1 to 254 value arguments from which choose selects a value or an action to perform based on index_num. the arguments can be numbers, cell references, defined names, formulas, functions, or text.
the example in figure 12.19 shows survey data from a number of respondents. columns b:f indicate their responses on five measures of your service. column g calculates an average that ranges from 1 to 5. say that you want to add words to column h to characterize the overall rating from the respondent. the following formula is used in cell h4:
=choose(g4,"strongly disagree","disagree","neutral","agree","strongly agree")
figure 12.19 choose is great for simple choices where the index number is between 1 and 254.

using vlookup with true to find a value based on a range
vlookup stands for vertical lookup. this function behaves differently, depending on the fourth parameter. this section describes using vlookup where you need to choose a value based on a table that contains ranges. suppose that you have a list of students and their scores on a test. the school grading scale is based on these ranges:
• 92–100 is an a.
• 85–91 is a b.
• 70–84 is a c.
• 65–69 is a d.
• below 65 is an f.
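a range lookup of this kind can be sketched in python as follows. this is a model under stated assumptions, not excel's own code: the names breakpoints, grades, and approximate_lookup are invented for the example, and python's bisect module stands in for the sorted search. each breakpoint is the lowest score that earns the grade beside it, listed in ascending order.

```python
from bisect import bisect_right

# Lowest score for each grade, in ascending order (assumed table layout)
breakpoints = [0, 65, 70, 85, 92]
grades = ["f", "d", "c", "b", "a"]


def approximate_lookup(score):
    """Return the grade for the largest breakpoint that is <= score."""
    position = bisect_right(breakpoints, score)
    if position == 0:
        # score is below the smallest breakpoint; Excel would show #N/A
        raise LookupError("#N/A")
    return grades[position - 1]


print(approximate_lookup(91))    # b
print(approximate_lookup(64.5))  # f
```

the search only works because the breakpoints are sorted in ascending order, which is the same requirement the sorted-range form of the lookup places on the worksheet table.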
follow these steps to set up a vlookup for this scenario:
1. because in this version of vlookup you do not have to list every possible grade, build a table showing the scores where the grading scale changes from one grade to the next.
2. although the published grading scale starts with the higher values, your lookup table must be sorted in ascending sequence. this requires a bit of translation as you set up the table. although the grading scale says below 65 is an f, you need to set up the table to show that an f corresponds to any grade at 0 or above. therefore, in cell e2 enter 0, and in cell f2, enter f.
3. continue building the grading scale in successive rows of columns e and f. anything at 65 or above is given a d. anything at 70 or above is given a c. note that this is somewhat counterintuitive because it is the opposite of the order that you would use if you were building a grading scale using nested if functions.
4. ensure that the numeric values are the leftmost column in your lookup table. in figure 12.20, the lookup table range is e2:f6. when you use vlookup, excel searches the first column of the lookup table for the appropriate score.
5. when using this version of vlookup with ranges, sort the list in ascending order. if you are not sure of the proper order, use the sort command from the home tab to sort the table.
6. because the first argument in the vlookup function is the student’s score, in cell c2, enter =vlookup(b2,.
7. because the next argument is the range of the lookup table, be sure to press the f4 key after typing e2:f6 to change it to the absolute reference $e$2:$f$6.
8. ensure that the third argument specifies which column of the lookup table should be returned. because the letter grade is in the second column of e2:f6, use 2 for the third argument.
9. ensure that the final argument is either true or simply omitted. this tells excel that you are using the sorted range variety of lookup.
10.
after you enter the formula in cell c2, again select cell c2 and double-click the fill handle to copy the formula down to all students.
figure 12.20 the vlookup formula in column c finds the correct grade from the table in columns e and f.

using vlookup with false to find an exact value
in some situations, you do not want vlookup to return a value based on a close match. instead, you want excel to find the exact match in the lookup table.
figure 12.21 shows a table of sales. the original table had just columns a through c: rep#, date, and sale amount. although a data analyst might have all the rep numbers memorized, the manager who is going to see the report prefers to have the rep names on the report. to fill in the rep names from a lookup table, you follow these steps:
1. in columns f and g, enter a table of rep numbers and rep names. note that it is not important that this table be sorted by the rep number field. it is fine that the table is alphabetical.
2. use false as the fourth parameter in vlookup. you need to do this because close matches are not acceptable here. if something was sold by a new rep with number r9, you do not want to give credit to the name associated with r8 just because it is a close match. either excel finds an exact match and returns the result, or excel does not give you a result.
3. for cell d2, you want excel to use the rep number in a2, so in cell d2, enter =vlookup(a2,.
4. the lookup table is in f2:g7, so type f2:g7 and then press the f4 key to make the reference absolute ($f$2:$g$7). this allows you to copy the formula in step 7. after pressing f4, type a comma.
5. in the lookup table, the rep name is in column 2 of the table, so type 2 to specify that you want to return the second column of the lookup table.
6. finish the function with false). press ctrl+enter to accept the formula and keep the cursor in cell d2.
7.
double-click the fill handle to copy the formula down to all the rows.
8. vlookup is a very time-intensive calculation. having thousands of vlookup formulas significantly affects your recalculation times. in this particular case, you have successfully added rep names. it would be appropriate to convert these live formulas to their current values. therefore, press ctrl+c to copy. then, from the home tab, select paste, paste values to convert the formulas to values.
9. look through the results. if a sale was credited to a new rep who is not in the table, the name appears as #n/a. manually fix these records, if needed.
note: if your lookup table is arranged with the key field in row 1, you should use hlookup, which is discussed later in this chapter. if your data is vertical but the key field is not the leftmost column, you can use a combination of index and match, also explained later in this chapter.
figure 12.21 in this case, vlookup needs to find the exact rep number from the table in columns f and g.
to recap, the two versions of the vlookup formula behave very differently. vlookup with false as the fourth parameter looks for an exact match, whereas vlookup with true as the fourth parameter looks for the closest (lower) match. in the true version, the lookup table must be sorted. in the false version, the table can be in any sequence. in every case, the key field must be in the left column of the lookup table.
syntax: vlookup( lookup_value,table_array,col_index_num,range_lookup )
vlookup searches for a value in the leftmost column of a table and then returns a value in the same row from a column you specify in the table. the vlookup function takes the following arguments:
• lookup_value—this is the value to be found in the first column of the table. lookup_value can be a value, reference, or text string.
• table_array—this is the table of information in which data is looked up.
you can use a reference to a range such as e2:f9 or a range name such as reptable.
• col_index_num—this is the column number in table_array from which the matching value must be returned. a col_index_num value of 1 returns the value in the first column in table_array; a col_index_num value of 2 returns the value in the second column in table_array, and so on. if col_index_num is less than 1, vlookup returns the #value! error value; if col_index_num is greater than the number of columns in table_array, vlookup returns the #ref! error value.
• range_lookup—this is a logical value that specifies whether vlookup should find an exact match or an approximate match. if it is true or omitted, an approximate match is returned. in other words, if an exact match is not found, the next largest value that is less than lookup_value is returned. if it is false, vlookup finds an exact match. if one is not found, the error value #n/a is returned.
if vlookup cannot find lookup_value and if range_lookup is true, it uses the largest value that is less than or equal to lookup_value. if lookup_value is smaller than the smallest value in the first column of table_array, vlookup returns an #n/a error. if vlookup cannot find lookup_value, and range_lookup is false, vlookup returns an #n/a error.

using vlookup to match two lists
if excel is used throughout your company, you undoubtedly have many lists in excel. people use excel to track everything. how many times are you faced with a situation in which you have two versions of a list and you need to match them up?
in figure 12.22, the worksheet has two simple lists. column a shows last week’s version of who was coming to an event. column c shows this week’s version of who is coming to an event. follow these steps if you want to find out quickly if anyone is new:
1. add the heading there? to cell d2.
2.
because the formula in cell d3 should look at the value in cell c3 to see if that person is in the original list in column a, start the formula with =vlookup(c3,$a$3:$a$15,.
3. because your only choice for the column number is to return the first column from the original list, finish the function with 1,false). then press ctrl+enter to accept the formula and stay in cell d3.
4. double-click the fill handle to copy the formula down to all rows.
for any cells where column d contains a name, it means that the person was on the rsvp list from last week. if the result of the vlookup is #n/a, you know that this person is new since the previous week.
tip: if you study the data in figure 12.22, you will see that three more names are in the column c list than in the column a list, yet four people were reported as being new this week. this means that one of the people from last week has dropped off the list. to quickly find who dropped off the list, use the formula =vlookup(a3,$c$3:$c$18,1,false) in b3:b15 to find that donald tyler has dropped off the list.
note that you can also use match to solve this problem.

using column to assist with vlookup when filling a wide table
this section discusses some special considerations to keep in mind when you have to retrieve many columns from a table. if you think carefully about the first formula, you can copy it to the entire table quickly.
figure 12.23 shows a table of several hundred skus, starting in row 21. for each sku, the table contains the inventory of that product on hand in the 12 regional warehouses. range a6:b13 contains a customer order for various skus. you want to build a table to help visualize which warehouse has most of the items in stock. if you find one warehouse that has all the inventory, you can minimize order shipping costs by shipping the entire order from that particular warehouse.
to solve this problem, follow these steps:
1. copy the range of warehouse names from b20:m20 to c5:n5.
2. think about the third argument in the vlookup function. for the formula in column c, you want to return the second column from the table. for the formula in column d, you want to return the third column. if you actually enter the 2 in the formula in column c, then after copying the formula over to d:n, you have to edit the third argument repeatedly.
3. create a range above your table, perhaps in row 4, that contains the numbers 2 through 13. you can then use cells in this row when building the third argument in the formula. in cell c4, enter the function =column(b2). because column b is the second column, this formula returns 2.
4. select cell c4. drag the fill handle to the right to copy the formula over through column n. the cell b2 reference is relative, resulting in the formula returning the numbers 2, 3, 4, and so on.
figure 12.22 an #n/a error as the result of vlookup tells you that the person is new to the list.
5. in cell c6, enter =vlookup(b6. when you later copy this formula, you always want the formula to point to column b, but you want to allow the formula to point to rows 7, 8, and so on. if you press the f4 key three times, the reference changes to $b6. type a comma.
6. type a21:m176. press f4 to change this reference to $a$21:$m$176. type a comma.
7. for the third argument, you want to point to the number 2 in cell c4. you always want this part of the formula to point to row 4, and you want to allow the column letter to change as the formula is copied to the right. press the f4 key twice to change the reference to c$4.
8. finish the formula with ,false). press ctrl+enter to accept the formula and stay in cell c6.
9. optionally, add a conditional format to cell c6 to highlight the cell if this formula is true: c6a6.
10. double-click the fill handle to copy the formula to c6:c13.
11.
drag the fill handle from the corner of c13 to the right until you have filled in the formula in the range of c:n.
the result is a table that shows the current inventory for each item, by warehouse. if you added the conditional formatting in step 9, you can quickly see which warehouses can fulfill most of the order.
figure 12.23 the column function in row 4 ensures that you can enter the vlookup formula once and copy it to the entire rectangular range.
although having the column function in row 4 allows you to visually understand the example better, you can eliminate row 4 and rewrite the formula in cell c6 as =vlookup($b6,$a$21:$m$176,column(b1),false).
syntax: column( reference )
the column function returns the column number of a given reference. this function takes the argument reference, which is the cell or range of cells for which you want the column number. if reference is omitted, it is assumed to be the reference of the cell in which the column function appears. if reference is a range of cells, and if column is entered as a horizontal array, column returns the column numbers of reference as a horizontal array. in this case, reference cannot refer to multiple areas.

using hlookup for horizontal lookup tables
hlookup stands for horizontal lookup. this function is similar to vlookup. hlookup operates in two distinct manners, based on the fourth parameter. if the fourth parameter is the value false, then hlookup looks for an exact match in the top row of the table. this is fine when you are looking up product codes, customer numbers, or any other discrete bits of information. however, if the fourth parameter is the value true or is omitted, hlookup treats the first row of the table as a sorted range of values. excel looks for the closest value that is lower than or equal to the one you specified. this is fine when you are trying to determine in which range a value belongs.
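the two modes can be compared side by side in a small python model. this is only a sketch: the function name hlookup mirrors the worksheet function for readability, the table is made-up sample data stored as a list of rows, and checks on row_index_num are left out.

```python
from bisect import bisect_right


def hlookup(lookup_value, table, row_index_num, range_lookup=True):
    """Model of HLOOKUP over a table stored as a list of rows."""
    top_row = table[0]
    if range_lookup:
        # sorted-range mode: largest top-row value <= lookup_value
        position = bisect_right(top_row, lookup_value) - 1
        if position < 0:
            raise LookupError("#N/A")
    else:
        # exact mode: the top row may be in any order
        if lookup_value not in top_row:
            raise LookupError("#N/A")
        position = top_row.index(lookup_value)
    return table[row_index_num - 1][position]


# Hypothetical sample table: thresholds in the top row, labels in row 2
table = [
    [10, 20, 30],
    ["low", "mid", "high"],
]
print(hlookup(25, table, 2))         # mid
print(hlookup(20, table, 2, False))  # mid
```

with the same table, looking up 25 in exact mode raises the #n/a stand-in, because 25 does not appear in the top row.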
syntax: hlookup( lookup_value,table_array,row_index_num,range_lookup )
the hlookup function searches for a value in the top row of a table. when the value is found, hlookup returns a value from a particular row in the column. this function takes the following arguments:
• lookup_value—this is a value to be found in the first row of the table. lookup_value can be a value, a reference, or a text string.
• table_array—this is a table of information in which data is looked up. you use a reference to a range or a range name. the values in the first row of table_array can be text, numbers, or logical values. if range_lookup is true, the values in the first row of table_array must be placed in ascending order such as ..., -2, -1, 0, 1, 2,...; a–z; or false, true. otherwise, hlookup may not give the correct value. if range_lookup is false, table_array does not need to be sorted. the search is not case-sensitive: uppercase and lowercase text are equivalent.
• row_index_num—this is the row number in table_array from which the matching value is returned. a row_index_num of 1 returns the first row value in table_array, a row_index_num of 2 returns the second row value in table_array, and so on. if row_index_num is less than 1, hlookup returns a #value! error; if row_index_num is greater than the number of rows in table_array, hlookup returns a #ref! error.
• range_lookup—this is a logical value that specifies whether you want hlookup to find an exact match or an approximate match. if it is true or omitted, an approximate match is returned. in other words, if an exact match is not found, the next largest value that is less than lookup_value is returned. if it is false, hlookup finds an exact match. if one is not found, the error #n/a is returned.
even though you are probably familiar with sorting a list from top to bottom, most people rarely sort a list from left to right.
if you are using the true version of hlookup, make sure that your table is sorted from left to right by the top row. to sort data from left to right, follow these steps:
1. select your range of data. in figure 12.24, this is g3:l8.
figure 12.24 the table in f:l is horizontal, so you use the hlookup function.
2. from the home tab, select the sort & filter drop-down and choose custom sort. the sort dialog appears.
3. in the sort dialog, click the options button. the sort options dialog appears.
4. in the sort options dialog, select sort left to right. click ok to close the sort options dialog.
5. in the sort dialog, choose to sort by row 3. click ok to sort.
figure 12.24 shows a tool used by the advertising department of a retail store. the store runs annual promotions for certain holidays. the table in f3:l8 tells the days for holidays in each of several years. the advertising manager knows that the store wants to run a sale circular the sunday before the holiday and that the art department needs the material 24 days before the ad is to run. by changing the year in cell b2, the advertising manager can create a new schedule for each year. to help the advertising manager, follow these steps:
1. ensure that the formula for each holiday starts as =hlookup(b2,. this tells excel to use the year found in cell b2 as the value to look up.
2. ensure that the lookup table is in g3:l8. excel looks through the first row of this table to find the matching year.
3. when the matching column is found, you want excel to return the date for easter. although this is in row 5 of the worksheet, it is in the third row of the table, so ensure that the third parameter for the function is 3.
4. your years are already sorted left to right, but if you use a value of true for the fourth parameter, this causes problems in the year 2014, so make the fourth parameter false. ensure that the formula in cell b6 is =hlookup(b2,g3:l8,3,false).
5. copy this formula to cell b11 and edit the formula to change the third parameter from 3 to 4.

using the match function to locate the position of a matching value
at first glance, match seems like a function that would rarely be useful. match returns the relative position of an item in a range that matches a specified value in a specified order. you use match instead of one of the lookup functions when you need the position of an item in a range instead of the item itself.
suppose that your manager asks, “can you tell me on which row i would find this value?” the manager wants to know the value or some piece of data on that record. however, the manager rarely wants to know that xyz is found on the 111th relative row within the range a99:a11432.
match comes in handy in several instances. in the first instance, consider a situation in which you are using vlookup to find whether an item is in a list. in this case, you do not care what value is returned. you are either interested in seeing if a valid value is returned, meaning that the entry is in the old list, or if an #n/a is returned, meaning that the entry is new. in this case, using match is a slightly faster way to achieve the same result.
another handy way to use match is in conjunction with the index function.
match has two features that make it more versatile than vlookup. match allows for wildcard matches. match also allows for a search based on an exact match, based on the number just below the value, or based on a value greater than or equal to the lookup value. this third option is not available in the vlookup or hlookup functions.
syntax: match( lookup_value,lookup_array,match_type )
the match function returns the relative position of an item in a column of values. it is useful for determining if a certain value exists in a list. the match function takes the following arguments:
• lookup_value—this is the value you use to find the value you want in a table.
lookup_value can be a value, which is a number, text, or logical value or a cell reference to a number, text, or logical value.
• lookup_array—this is a contiguous range of cells that contains possible lookup values. lookup_array can be an array or an array reference.
• match_type—this is the number -1, 0, or 1. note that you can use true instead of 1 and false instead of 0. match_type specifies how microsoft excel matches lookup_value with values in lookup_array. if match_type is 1, match finds the largest value that is less than or equal to lookup_value. lookup_array must be placed in ascending order, such as ... -2, -1, 0, 1, 2,...; a–z; or false, true. if match_type is 0, match finds the first value that is exactly equal to lookup_value. lookup_array can be in any order. if match_type is -1, match finds the smallest value that is greater than or equal to lookup_value. lookup_array must be placed in descending order, such as true, false; z–a; or ...2, 1, 0, -1, -2,.... if match_type is omitted, it is assumed to be 1.
match returns the position of the matched value within lookup_array, not the value itself. for example, match("b",{"a","b","c"},0) returns 2, the relative position of b within the array {"a","b","c"}.
match does not distinguish between uppercase and lowercase letters when matching text values. if match is unsuccessful in finding a match, it returns an #n/a error.
if match_type is 0 and lookup_value is text, lookup_value can contain the wildcard characters asterisk (*) and question mark (?). an asterisk matches any sequence of characters; a question mark matches any single character.

using match to compare two lists
you may face situations in which you have two versions of a list, and you need to match them up. in figure 12.25, the worksheet has two simple lists. column a shows last week’s list. column c shows this week’s version of the list.
you want to find out quickly which items are new. here’s how you do it:
figure 12.25 match operates slightly more quickly than vlookup and achieves the same result in this special case where you are trying to figure out whether a value is in another list.
1. add the heading there? to cell d2.
2. because the formula in cell d3 looks at the value in cell c3 to see if that value is in the original list in column a, start the formula with =match(c3,$a$3:$a$11,.
3. because you want an exact match, use 0 as the third parameter. finish the function with a ). press ctrl+enter to accept the formula and stay in cell d3.
4. double-click the fill handle to copy the formula down to all rows.
for any cells where column d contains a number, it means that the entry was on the original list from last week. if the result of match is #n/a, you know that this item is new since the previous week.

using index and match for a left lookup
index is another function that does not immediately seem to have many great uses. in its basic form, index returns the cell from a particular row and column of a rectangular range. as shown in figure 12.26, using =index(b5:d9,3,2) seems like a needlessly complicated way to refer to cell c7.
figure 12.26 on its own, index is not a particularly useful function.
however, in the previous section you learned about a function that searches through a range and tells you the position of the match within the range. on its own, finding the position of a match is not very useful. it becomes very useful when used inside of the index function.
in figure 12.27, a customer number is entered in cell b1. the customer lookup table appears in columns f, g, and h. the main problem is that the customer table does not have the customer number on the left side. in many cases, you would copy column h to column e and use column e as the key of the table.
however, the table in f:h is likely to be repopulated every day from a web query or an olap query. therefore, it might become monotonous to move the data after every refresh. the solution is to use a combination of index and match. here’s what you do:
1. use the formula =match(b1,h2:h89,0) to search through column h to find the row with the customer number that matches the one in cell b1. in this case, c593 is in row 12, which is the 11th row of the table.
2. be sure to use exactly the same shape range as the first argument in the index function: =index(f2:f89,whichrow,whichcolumn) searches through the customer names in column f.
3. for the second parameter of the index function, specify the relative row number. this information was provided by the match function in step 1.
4. ensure that the third parameter of the index function is the relative column number. because the range f2:f89 has only one column, this is either 1 or it can simply be omitted.
5. putting the formula together, the formula in cell b2 is =index(f2:f89,match(b1,h2:h89,0),1).
figure 12.27 this combination of index and match allows you to look up data that is to the left of a key field.
syntax: index( array,row_num,[column_num] )
the index function returns the value at the intersection of a particular row and column within a range. the index function takes the following arguments:
• array—this is a range of cells or an array constant. if array contains only one row or column, the corresponding row_num or column_num argument is optional. if array has more than one row and more than one column, and if only row_num or column_num is used, index returns an array of the entire row or column in array.
• row_num—this selects the row in array from which to return a value. if row_num is omitted, then column_num is required.
• column_num—this selects the column in array from which to return a value.
if column_num is omitted, then row_num is required.
if both the row_num and column_num arguments are used, index returns the value in the cell at the intersection of row_num and column_num.
if you set row_num or column_num to 0, index returns the array of values for the entire column or row, respectively. to use values returned as an array, you use the index function as an array formula in a horizontal range of cells for a row and in a vertical range of cells for a column. to enter an array formula, you press ctrl+shift+enter.
row_num and column_num must point to a cell within array; otherwise, index returns a #ref! error.

using match and index to fill a wide table
the lookup functions vlookup, hlookup, and match can be very processor-intensive when the lookup table contains hundreds of thousands of rows. back in figure 12.23, excel had to do 96 vlookup functions. however, after excel figured out the position of item g598 in the lookup table for cell c6, it had to go back through exactly the same steps for cell d6, e6, f6, g6, and so on. you made excel find exactly the same item 12 times, which is a very slow process.
if the recalculation times are taking too long, you should consider using one match per row to find the relative row number and then using 12 speedy index functions to fill in the values in that row. figure 12.28 illustrates a problem where you can use this trick. in this case, the list of inventory items is 14,000 rows. here’s what you do:
1. copy the range of warehouse names from b20:m20 to d5:o5.
2. in cell c6, enter =match(b6,$a$21:$a$14060,0). this formula finds an exact match for c529. the answer 8005 means that product c529 is on the 8,005th relative row of the lookup range.
3. copy the formula in cell c6 to c6:c13.
4. as you build the index function, be careful that the array range encompasses the same rows used in the match function.
start the formula in cell d6 as =index(b21:m14060. make sure to press f4 to make this reference absolute ($b$21:$m$14060).
5. make the next argument the relative row number within the lookup range. this is the value from column c, so use c6. if you type c6 and then press the f4 key three times, excel adds the dollar sign before the c, giving $c6.
6. add the final argument, the column number. for the first warehouse, this would be column 1. however, rather than typing a 1 for the formula, use column(a1). this allows you to copy the formula to the rest of the range. finish the formula with a parenthesis. the final formula is =index($b$21:$m$14060,$c6,column(a4)). note that it is not important if you use column(a1), column(a4), or column(a10000). all of those will return the number 1.
7. optionally, add a conditional format to cell d6 to highlight the cell if this formula evaluates to true: d6a6.
8. copy the formula from cell d6 to d6:o13.

performing many lookups with lookup
even excel help tells you to avoid the old lookup function. however, lookup can do one useful trick that vlookup and hlookup cannot do—it can process many lookups in one single array formula. lookup can also deal with a lookup range that is vertical and a return range that is horizontal, or vice versa.
the next section looks at the common use of lookup and how it contrasts with vlookup or hlookup.
syntax: lookup( lookup_value, array )
in this case, lookup acts like vlookup or hlookup. excel examines the height and width of the array. if the array has more rows than columns, excel assumes you are doing a vlookup and looks through the first column of the array for the lookup value. if the array has more columns than rows, excel assumes you are doing an hlookup and looks through the first row of the array for the lookup value. in this syntax of lookup, excel always returns the value from the last column or row of the array.
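the shape rule for this array form can be written out as a short python model. the sketch below is an illustration only: lookup_array_form is a made-up name, the sample arrays are invented, and the array is assumed to be sorted on its key column or row. a square array is treated like a tall one here, which matches excel's documented behavior as far as this editor is aware.

```python
from bisect import bisect_right


def lookup_array_form(lookup_value, array):
    """Model of the array form of LOOKUP; array is a list of rows."""
    n_rows, n_cols = len(array), len(array[0])
    if n_cols > n_rows:
        # wider than tall: search the first row, return from the last row
        keys, results = array[0], array[-1]
    else:
        # taller than wide (or square): search first column, return last column
        keys = [row[0] for row in array]
        results = [row[-1] for row in array]
    # always a range lookup: largest key that is <= lookup_value
    position = bisect_right(keys, lookup_value) - 1
    if position < 0:
        raise LookupError("#N/A")
    return results[position]


tall = [[1, "a", "x"], [5, "b", "y"], [9, "c", "z"], [12, "d", "w"]]
wide = [[1, 5, 9, 12], ["p", "q", "r", "s"]]
print(lookup_array_form(6, tall))  # y
print(lookup_array_form(9, wide))  # r
```

the caller never states a direction or a result column; both are inferred from the shape of the array.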
in figure 12.29, the formula in b2 is returning a value from cell g3. because the array is described as e2:g5, excel automatically returns a value from the final column of e2:g5. because the array is four rows and three columns, excel assumes you want the equivalent of vlookup instead of hlookup.
in cell b3, the lookup array is d7:g8. because this array is wider than it is tall, cell b3 does the equivalent of an hlookup.
in addition, lookup always performs a range lookup, similar to leaving off the false as the fourth parameter of vlookup or hlookup. for this reason, your lookup array must always be sorted.
if you do not want to return a value from the last column of the array, you can specify two vectors in the alternative form of the syntax discussed in the next section.
figure 12.28 this performs eight relatively slow match functions and then 96 relatively fast index functions.
syntax: lookup( lookup_value, lookup_vector, result_vector )
in this version of the lookup function, you specify vectors that are either one row tall or one column wide. this version allows you to do a lookup similar to vlookup where the result field is to the left of the key field. in cell b4 of figure 12.29, the result vector is to the left of the lookup vector.
figure 12.29 the quirky lookup function decides to do a vlookup or hlookup depending on the shape of the lookup array.
so far, everything about lookup can be accomplished using vlookup, hlookup, or index and match. however, a useful trick makes lookup better than those other functions: you can ask excel to look up many values at one time, provided that you do the following:
1. press ctrl+shift+enter to accept the formula.
2. enclose the lookup in a wrapper function such as sum to summarize all the results from the function.
in figure 12.30, a series of invoices appear in rows 4 through 17.
a gp% (gross profit percentage) is associated with each invoice. the sales rep earns a bonus depending on the gp% of each invoice, as shown in e6:f10. instead of calculating a bonus for each row, you can calculate a bonus for all the rows at once. the formula in b1 of figure 12.30 specifies an array of b4:b17 as the lookup value. this causes excel to perform the lookup 14 times, once for each value in the range b4:b17. the formula wraps the lookup results in a sum function to add up all the bonus results. to calculate correctly, you must hold down ctrl+shift while pressing enter after typing this formula.

using functions to describe the shape of a contiguous reference

four functions can be used to identify the location and shape of a contiguous range:
• column(reference): this returns the column number of the upper-left corner of a reference, using numbers from 1 to 16,384. if reference is omitted, the function returns the column number of the cell where the formula is entered.
• row(reference): this returns the row number of the upper-left corner of the reference, using numbers from 1 to 1,048,576. if reference is omitted, the function returns the row number of the cell where the formula is entered.
• columns(reference): this returns the number of columns in a reference. in this case, reference must be a single contiguous range.
• rows(reference): this returns the number of rows in a reference. again, reference must be a single contiguous range.

figure 12.31 displays the row, column, rows, and columns functions of a named range. the range occupies the black cells in b7:d11.

figure 12.30 unlike vlookup and hlookup, the aging lookup function can process many lookups in a single array formula.
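the sum-wrapped lookup array formula for the invoice bonuses amounts to one range lookup per invoice followed by a total. a minimal python sketch of that idea follows. the bonus table and the gp% values are invented for illustration (the figure's actual numbers are not reproduced here), and lookup_vector is a hypothetical stand-in for the vector form of lookup.

```python
from bisect import bisect_right

def lookup_vector(value, keys, results):
    """Rough model of LOOKUP(value, lookup_vector, result_vector)."""
    pos = bisect_right(keys, value)  # count of keys <= value
    if pos == 0:
        raise ValueError("value is below the first key (#N/A in Excel)")
    return results[pos - 1]

# Hypothetical bonus table: GP% break points (whole percents) and bonus amounts.
gp_breaks = [0, 40, 45, 50, 55]
bonuses = [0, 5, 10, 20, 50]

# Hypothetical GP% for five invoices.
invoice_gp = [42, 38, 51, 55, 47]

# One lookup per invoice, then a single total, like SUM(LOOKUP(range, ...)).
total_bonus = sum(lookup_vector(gp, gp_breaks, bonuses) for gp in invoice_gp)
print(total_bonus)  # 85
```

the generator expression plays the role of the array: each invoice is looked up separately, and sum collapses the individual results into one number.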
figure 12.31 these functions describe the location and shape of a range.

using areas and index to describe a range with more than one area

all the functions listed in the preceding section fail if the reference describes a noncontiguous range. however, you can check for that condition by using the areas function.

syntax: areas( reference )

this function returns the number of contiguous ranges in a reference. the argument reference usually refers to a named range.

in figure 12.32, myareas is a defined name that describes the cells in black. in rows 1 through 4, all the traditional functions fail with #ref! errors because the reference contains more than one contiguous range.

syntax: index( reference,row_num,column_num,area_num )

if you need to determine the location and shape of each contiguous range, do so one area at a time. a second syntax for the index function returns a reference to one specific area of a reference. this syntax includes the following arguments:
• reference: reference to one or more cell ranges. if you are entering a nonadjacent range for the reference, enclose the reference in parentheses. if each area in reference contains only one row or column, the row_num or column_num argument, respectively, is optional. for example, for a single row reference, you use index(reference,column_num).
• row_num: the number of the row in reference from which to return a reference.
• column_num: the number of the column in reference from which to return a reference.
• area_num: selects a range in reference from which to return the intersection of row_num and column_num. the first area selected or entered is numbered 1, the second is 2, and so on. if area_num is omitted, index uses area 1. for example, if reference describes the cells (a1:b4,d1:e4,g1:h4), then area_num 1 is the range a1:b4, area_num 2 is the range d1:e4, and area_num 3 is the range g1:h4.
after reference and area_num have selected a particular range, row_num and column_num select a particular cell: row_num 1 is the first row in the range, column_num 1 is the first column, and so on. the reference returned by index is the intersection of row_num and column_num.

if you set row_num or column_num to 0, index returns the reference for the entire column or row, respectively.

row_num, column_num, and area_num must point to a cell within reference; otherwise, index returns a #ref! error. if row_num and column_num are omitted, index returns the area in reference specified by area_num.

the result of the index function is a reference, and it is interpreted as such by other formulas. depending on the formula, the return value of index may be used as a reference or as a value. for example, the formula =cell("width",index(a1:b2,1,2)) is equivalent to =cell("width",b1). the cell function uses the return value of index as a cell reference. on the other hand, a formula such as =2*index(a1:b2,1,2) translates the return value of index into the number in cell b1.

using this version of index, you can build formulas that work on one particular area in a named range. here’s how you do it:
1. in b15:e15, enter the numbers 1 through 4. these correspond to the four areas in myareas.
2. when you build the index function, you want excel to return a reference to the entire rows and columns of the first area of the range, so use index(myareas,,,1) to return such a reference.
3. instead of using 1 for the area_num argument of index, use index(myareas,,,b15).
4. enter the formula =column(index(myareas,,,b15)) in cell b16 to define the starting column of area 1 of myareas.
5. copy the formula from step 4 to b17:b20. edit each function to change column to row, columns, rows, and areas.
6. copy b16:b20 to columns c, d, and e.
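the reference form of index can be modeled with plain tuples to show how area_num, row_num, and column_num narrow a multi-area reference. this is a hedged sketch only: index_ref and describe are hypothetical helpers, each area is assumed to be stored as (top row, left column, bottom row, right column) with 1-based numbers, and the three areas are the a1:b4, d1:e4, g1:h4 example from the argument list.

```python
def index_ref(areas, row_num=0, column_num=0, area_num=1):
    """Rough model of INDEX(reference,row_num,column_num,area_num).

    Returns the selected rectangle as (top, left, bottom, right).
    A row_num or column_num of 0 keeps every row or column of the area.
    """
    if not 1 <= area_num <= len(areas):
        raise IndexError("#REF!: area_num is outside the reference")
    top, left, bottom, right = areas[area_num - 1]
    n_rows, n_cols = bottom - top + 1, right - left + 1
    if not 0 <= row_num <= n_rows or not 0 <= column_num <= n_cols:
        raise IndexError("#REF!: row_num or column_num is outside the area")
    if row_num:
        top = bottom = top + row_num - 1
    if column_num:
        left = right = left + column_num - 1
    return (top, left, bottom, right)

def describe(rect):
    """COLUMN, ROW, COLUMNS, and ROWS for one contiguous rectangle."""
    top, left, bottom, right = rect
    return {"column": left, "row": top,
            "columns": right - left + 1, "rows": bottom - top + 1}

# (a1:b4, d1:e4, g1:h4) written as 1-based (top, left, bottom, right) tuples.
areas = [(1, 1, 4, 2), (1, 4, 4, 5), (1, 7, 4, 8)]

print(len(areas))                             # 3, what AREAS would report
print(describe(index_ref(areas, area_num=3)))  # area 3 starts in column 7 (g)
print(index_ref(areas, 2, 1, 3))              # (2, 7, 2, 7), the single cell g2
```

looping area_num from 1 to len(areas) and calling describe on each result corresponds to the block of formulas built in the steps above, one column per area.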
figure 12.32 to describe a reference with multiple contiguous ranges, you have to use the reference form of the index function.

the result, as shown in figure 12.32, includes four sets of formulas in b16:e20 that completely describe the four areas of the named range myareas.

using numbers with offset to describe a range

the language of excel is numbers. there are functions that count the number of entries in a range. there are functions that can tell you the numeric position of a looked-up value. you may know that a particular value is found in row 20, but what if you want to perform calculations on other cells in row 20? the offset function handles this very situation.

you can use offset to describe a range using mostly numbers. offset is flexible: it can describe a single cell, or it can describe a rectangular range.

although index can return a single cell from a rectangular range, it has limitations. if you specify c5:z99 as the range for an index function, you can select only cells below and/or to the right of c5. the offset function can move up and down or left and right from the starting cell, which is c5.

syntax: offset( reference,rows,cols,height,width )

the offset function returns a reference to a range that is a given number of rows and columns from a given reference. the offset function takes the following arguments:
• reference: this is the reference from which you want to base the offset. reference must be a reference to a cell or range of adjacent cells; otherwise, offset returns a #value! error.
• rows: this is the number of rows, up or down, that you want the upper-left cell to refer to. using 5 as the rows argument, for example, specifies that the upper-left cell in the reference is five rows below reference. rows can be positive, which means below the starting reference, or negative, which means above the starting reference.
• cols: this is the number of columns, to the left or right, that you want the upper-left cell of the result to refer to. for example, using 5 as the cols argument specifies that the upper-left cell in the reference is five columns to the right of reference. cols can be positive, which means to the right of the starting reference, or negative, which means to the left of the starting reference. if rows and cols offset reference over the edge of the worksheet, offset returns a #ref! error. figure 12.33 demonstrates various combinations of rows and cols from a starting cell of c5.
• height: this is the height, in number of rows, that you want the returned reference to be. height must be a positive number.
• width: this is the width, in number of columns, that you want the returned reference to be. width must be a positive number.

if height or width is omitted, excel assumes it is the same height or width as reference.

offset allows you to specify a reference. it does not move any cell. it does not change the selection. it is just a numeric way to describe a reference. offset can be used in any function that is expecting a reference argument.

excel help provides a trivial example of =sum(offset(c2,1,2,3,1)), which sums e3:e5. however, this example is silly because no one would ever write such a formula! if you were to write such a formula, you would just write =sum(e3:e5) instead.

the power of offset comes when at least one of the four numeric arguments is calculated by the count function or a lookup function. in figure 12.34, you can use count(a5:a999) to count how many entries are in column a. if you assume that there are no blanks in the range of data, you can use the count result as the height argument in offset to describe the range of numbers. here’s what you do:
1. there is nothing magic about the reference, so write it as =offset(a5,
2. do not move the starting position any rows or columns from cell a5.
the starting position is a5, so you always use 0 and 0 for rows and columns. therefore, the formula is now =offset(a5,0,0,
3. if you want to include only the number of entries in the list, use count(a5:a999) as the height of the range. the formula is now =offset(a5,0,0,count(a5:a999),
4. the width is one column, so make the function =offset(a5,0,0,count(a5:a999),1).
5. use your offset function anywhere that you would normally specify a reference. you can use =sum(offset(a5,0,0,count(a5:a999),1)) or specify that formula as the series in a chart. this creates a dynamic chart that grows or shrinks as the number of entries changes.

figure 12.33 these offset functions return a single cell that is a certain number of rows and columns away from cell c5.

for a more complex example of offset, examine figure 12.35, which shows several yearly tables starting in cell c8. each month of the table contains from one to five entries. the person using this spreadsheet will select a year and a month from cells e1 and e2. the goal is to find information about the entries for that particular month and year. here’s how you do it:
1. have the formula in cell i1 find the starting row for the particular year, using the match function shown in cell j1.
2. have the formula in cell i2 find the column for the chosen month, using the match function shown in cell j2.
3. build the offset function to describe the range for that month and year. you know that it starts in the row in i1 and the column in i2. if you make the reference cell a1, then row 15 is 14 rows below a1. therefore, use =offset(a1,i1-1,
4. the starting column is in cell i2. column 8 is seven columns to the right of a1. therefore, you now use =offset(a1,i1-1,i2-1,
5. the structure of the worksheet allows for up to five entries per month, arranged down a column. thus, height is 5 and width is 1.
use the following formula to describe the possible range for the month: =offset(a1,i1-1,i2-1,5,1). this is good enough to use for min, max, sum, and so on.
6. to chart the data, figure out the exact height. use the =count(offset(a1,i1-1,i2-1,5,1)) formula in cell i3 to count the number of entries for the month.
7. use the formula =offset(a1,i1-1,i2-1,i3,1) to describe the exact month. add additional formulas in i4:i6 to figure out the minimum, maximum, and sum of those cells.

figure 12.34 every argument except height is hard-coded in these functions. the height argument comes from a count function to allow the range to expand as more entries are added.

the offset function initially seems intimidating, especially in light of the example you just walked through. remember that for useful results from offset, you usually replace one or more of the final four arguments with a calculation.

using address to find the address for any cell

if someone asks you for the cell address for the cell in row 5, column 5, you could probably come up with e5 quickly. what if someone asks you for the cell address of the cell in row 26, column 26? this is z26. again, you should come up with this if you know there are 26 letters in the alphabet. if someone asks you to calculate the address of row 2 and column 30, you have to divide 30 by 26 to learn that the result is 1 with a remainder of 4. this could lead you to conclude the cell address is the first letter of the alphabet (a) followed by the fourth letter of the alphabet (d), to come up with ad2.

this type of calculation becomes far more complex with 16,384 columns. for example, how would you calculate the address for row 2 of column 14123? fortunately, excel provides the address function to convert any intersection of row and column number to an address. by default, address(2,14123) returns the text $twe$2.
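the divide-by-26 arithmetic can be written out as a short routine. the sketch below is an illustration in python, not how excel computes addresses internally, and column_letters is a hypothetical helper name. it subtracts 1 before each division so that column numbers that are exact multiples of 26 (such as 26 itself) come out as z rather than producing a remainder of 0.

```python
def column_letters(col_num):
    """Convert a 1-based column number (1 to 16,384) to its letter label."""
    if not 1 <= col_num <= 16384:
        raise ValueError("column number is outside the worksheet")
    letters = ""
    while col_num > 0:
        # Shift to 0-based before dividing so 26 maps to Z, not to "A" plus nothing.
        col_num, remainder = divmod(col_num - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

print(column_letters(5))      # E
print(column_letters(26))     # Z
print(column_letters(30))     # AD
print(column_letters(14123))  # TWE
print(column_letters(16384))  # XFD
```

for column 30 the loop runs twice: the first pass yields d (remainder 3 after the shift) and the second yields a, which matches the ad result worked out by hand.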
syntax: address( row_num,column_num,abs_num,a1,sheet_text )

the default version of address returns the cell address as an absolute address with both dollar signs. there are optional parameters to control this behavior:
• row_num: this is the row number to use in the cell reference.
• column_num: this is the column number to use in the cell reference.
• abs_num: this specifies the type of reference to return. if it is 1 or omitted, the returned address has both dollar signs and is absolute. if it is 2, the row is held absolute, but the column is relative. if it is 3, the row is relative and the column is absolute. if it is 4, the address is relative, with no dollar signs.
• a1: this is a logical value that specifies the a1 or r1c1 reference style. if a1 is true or omitted, address returns an a1-style reference; if it is false, address returns an r1c1-style reference.
• sheet_text: this is text that specifies the name of the worksheet to be used as the external reference. if sheet_text is omitted, no sheet name is used.

figure 12.35 even with a poorly designed database spreadsheet, various combinations of offset can locate and total cells for a specific month.

figure 12.36 shows eight ways to describe one cell, depending on the various combinations of the abs_num and a1 arguments.

the sheet_text argument is interesting. it is difficult to remember the arcane rules for when to use apostrophes and where the exclamation point needs to go in an address. if you specify sheet_text as the name of a worksheet or use the style [book_name]sheetname, excel builds the proper reference. cell b11 in figure 12.36 shows the result from an address function that builds a reference to another workbook.

tip: to find the value of a cell described by address, use the indirect function.

figure 12.36 address can return a cell address in a1 or r1c1 style.
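the eight combinations of abs_num and a1 can be enumerated with a small model of address. this python sketch is an approximation under stated assumptions: address and column_letters are hypothetical helpers, the sheet_text argument is left out entirely because its quoting rules are not modeled, and the example cell (row 2, column 3) is arbitrary.

```python
def column_letters(col_num):
    """1-based column number to letters (3 -> C)."""
    letters = ""
    while col_num > 0:
        col_num, remainder = divmod(col_num - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

def address(row_num, column_num, abs_num=1, a1=True):
    """Rough model of ADDRESS without the sheet_text argument."""
    row_abs = abs_num in (1, 2)  # 1 and 2 hold the row absolute
    col_abs = abs_num in (1, 3)  # 1 and 3 hold the column absolute
    if a1:
        col = ("$" if col_abs else "") + column_letters(column_num)
        row = ("$" if row_abs else "") + str(row_num)
        return col + row
    # R1C1 style: relative parts are written in square brackets.
    row = f"R{row_num}" if row_abs else f"R[{row_num}]"
    col = f"C{column_num}" if col_abs else f"C[{column_num}]"
    return row + col

for style in (True, False):
    for abs_num in (1, 2, 3, 4):
        print(abs_num, style, address(2, 3, abs_num, style))
```

the loop prints $C$2, C$2, $C2, and C2 for the a1 style, followed by R2C3, R2C[3], R[2]C3, and R[2]C[3] for the r1c1 style.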
using indirect to build and evaluate cell references on-the-fly

the indirect function is deceptively powerful. consider this trivial example: in cell a1, enter the text b2. in cell b2, enter a number. in cell c3, enter the formula =indirect(a1). excel returns in cell c3 the number that you entered in cell b2. the indirect function looks in cell a1 and expects to find something that is a valid cell or range reference. it then looks in that address to return the answer for the function.

the reference text can be any text that you can string together using various text functions. this allows you to create complex references that dynamically point to other sheets or to other open workbooks.

the reference text can also be a range name. you could have a validation list box where someone selects a value from a list. if you have predefined a named range that corresponds to each possible entry on the list, indirect can point to the various named ranges on-the-fly.

when you use traditional formulas, even absolute formulas, there is a chance that someone might insert rows or columns that will move the reference. if you need a formula to always point to cell j10, no matter how someone rearranges the worksheet, you can use =indirect("j10") to handle this.

syntax: indirect( ref_text,a1 )

the indirect function returns the reference specified by a text string. the indirect function takes the following arguments:
• ref_text: this is a reference to a cell that contains an a1-style reference, an r1c1-style reference, a name defined as a reference, or a reference to a cell as a text string. if ref_text is not a valid cell reference, indirect returns a #ref! error. if ref_text refers to an external workbook, the other workbook must be open. if the source workbook is not open, indirect returns a #ref! error.
• a1: this is a logical value that specifies what type of reference is contained in the cell ref_text.
if a1 is true or omitted, ref_text is interpreted as an a1-style reference. if a1 is false, ref_text is interpreted as an r1c1-style reference.

figure 12.37 is a monthly worksheet in a workbook that has 12 similar sheets. in each worksheet, the data headings are in row 6, and the invoices appear for some number of rows, starting in row 7. each worksheet has a total for the month in cell d2.

figure 12.37 you can add a year-to-date formula to all sheets.

in this example, you want to add a year-to-date total in cell d3 on each worksheet. this is fairly difficult to do without vba. many vba books include a user-defined function to describe the previous sheet in a workbook. however, this function will fail if you send the workbook to someone who disables macros on her computer. instead, you can solve this problem with clever use of text functions and the indirect function. to do so, follow these steps:
1. select the jan worksheet.
2. shift+click the dec worksheet to put all 12 sheets in group mode.
3. in cell a1, enter the formula =a7. this adds the first date as a title for the worksheet.
4. format cell a1 with the custom format mmmm, yyyy. this causes the date to appear as january, 2010.
5. right-click the jan tab name and select ungroup sheets.
6. enter =d2 as the year-to-date formula in cell d3 of the jan tab.
7. on the feb worksheet, build a text formula that returns the name of the previous month. the question becomes how to build a formula that looks like jan!d3.
8. jan is a three-letter abbreviation for any date in the month of january. therefore, enter a january date in a cell and format the cell with the custom number format mmm, so that the result is the word jan.
9. the text function takes a number or date and displays it using a specific custom number format, so on the february sheet, use text(a1,"mmm"), which results in the value feb. this is close.
if you can find a way to get the name of the previous month, the problem will be solved.
10. the value in cell a1 is a live date. you can use date math to calculate a different date, such as the date one month earlier. use the date(year,month,day) function to return a date in the previous month. for the year parameter, use year(a1). for the month parameter, use month(a1)-1. for the day parameter, use 1. the formula date(year(a1),month(a1)-1,1) returns a date that is the first of the previous month.
11. combining steps 9 and 10 into a single formula, use text(date(year(a1),month(a1)-1,1),"mmm") to return the value of jan on the feb worksheet, feb on the march worksheet, and so on.
12. use the generic formula text(date(year(a1),month(a1)-1,1),"mmm")&"!d3" to build the reference.
13. select the feb worksheet. shift+click the dec worksheet to place these 11 worksheets in group mode. in cell d3, enter this formula: =indirect(text(date(year(a1),month(a1)-1,1),"mmm")&"!d3")+d2.
14. right-click any sheet tab and select ungroup to take the workbook out of group mode.

the result, as shown in figure 12.38, is a formula on the last 11 worksheets that automatically pulls the year-to-date total from the previous worksheet and adds it to the current worksheet total.

using the hyperlink function to quickly add hyperlinks

excel enables you to add a hyperlink by using the excel interface. on the insert tab, select the hyperlink icon. next, you specify text to appear in the cell and the underlying address. building links in this way is easy, but it is tedious to build them one at a time. if you have hundreds of links to add, you can add them quickly by using the hyperlink function.

syntax: hyperlink( link_location,friendly_name )

the hyperlink function creates a shortcut that opens a document stored on your hard drive, on a network server, or on the internet.
the hyperlink function takes the following arguments:
• link_location: this is the url address on the internet. it could also be a path, filename, and location in another file. for example, you could link to "[c:\files\jan2007.xls]sheet1!a15". note that link_location can be a text string enclosed in quotes or a cell that contains the link.
• friendly_name: this is the underlined text or numeric value that is displayed in the cell. friendly_name is displayed in blue and is underlined. if friendly_name is omitted, the cell displays the link_location value as the jump text. friendly_name can be a value, a text string, a name, or a cell that contains the jump text or value. if friendly_name returns an error (for example, #value!), the cell displays the error instead of the jump text.

figure 12.38 cell d4 dynamically builds a text formula to reference the previous sheet, and then indirect evaluates the formula.

note: excel does not check whether the link location is valid at the time you create the link. if the link is not valid when someone clicks it, the person encounters an error.

tip: it is difficult to select a cell that contains a hyperlink function. if you click the cell, excel attempts to follow the hyperlink. instead, you should click a cell near the cell and then use the arrow keys to move into the cell.

figure 12.39 shows a list of web pages in column a. column b contains the titles of those web pages. to quickly build a table of hyperlinks, you use =hyperlink(a2,b2) in cell c2 and copy the formula down the column.

after the hyperlinks are created, you can copy column c and use paste values on column c. you are then free to delete columns a and b.

figure 12.39 the formulas in column c allow you to create hundreds of hyperlinks in seconds.
using the transpose function to formulaically turn data

with many people using excel in a company, there are bound to be different usage styles from person to person. some people build their worksheets horizontally, and other people build their worksheets vertically.

for example, in figure 12.40, the monthly totals stretch horizontally across row 80. however, for some reason, you need these figures to be arranged going vertically down from cell b84. the typical method is to copy c80:n80 and then use home, paste, transpose. this copies a snapshot of the totals in row 80 to a column of data.

this is fine if you only need a snapshot of the totals. however, what if you want to see the totals continually updated in column b? excel provides the transpose function for such situations.

figure 12.40 turning c80:n80 into a vertical range is called transposing the data.

because the function returns several answers, you need to use special care when entering the formula. here’s how:
1. note that c80:n80 contains 12 cells.
2. select an identical number of cells starting in b84. select b84:b95.
3. even though you have 12 cells selected, type the formula =transpose(c80:n80) as if you had only one cell selected.
4. to tell excel that this is a special type of formula called an array formula, hold down ctrl+shift while you press enter. excel shows the formula surrounded by curly braces in the formula bar.

this is one single formula entered in 12 cells. therefore, you cannot delete or change one cell in the range. if you want to change the formula, you need to delete all 12 cells in b84:b95 in a single command.

figure 12.41 shows a transpose function that occupies 12 cells.

figure 12.41 one transpose function occupies 12 cells, from b84:b95.

syntax: transpose( array )

the transpose function transposes a vertical range into a horizontal array, or vice versa. the argument array is an array or a range of cells on a worksheet that you want to transpose.
the transposition of an array is accomplished by using the first row of the array as the first column of the new array, the second row of the array as the second column of the new array, and so on.

note: you can also use transpose to turn a vertical range into a horizontal range.

using the rtd function and com add-ins to retrieve real-time data

third-party applications are available to send streaming real-time data to an excel spreadsheet. they became very popular with stock day traders back in the late 1990s. if you have one of these com add-ins installed on your system, you can set up a formula to retrieve real-time data from the com add-in by using the rtd function.

if you have such a com add-in installed, the vendor of the add-in should provide sample workbooks with rtd functions already in place.

syntax: rtd( progid,server,topic1,[topic2],...)

the rtd function returns real-time data from a program that supports com automation. the rtd function takes the following arguments:
• progid: this is the name of the program id of a registered com automation add-in that has been installed on the local computer. you need to enclose the name in quotation marks.
• server: this is the name of the server where the add-in should be run. if there is no server and the program is run locally, leave this argument blank.
• topic1, topic2,...: these are 1 to 28 parameters that together represent a unique piece of real-time data.

using getpivotdata to retrieve one cell from a pivot table

you might turn to this book to find out how to use most of the functions. however, for the getpivotdata function, you are likely to turn to this book to find out why the function is being automatically generated for you.

suppose that you have a pivot table on a worksheet. you should click outside the pivot table.
next, you type an equal sign and then, with the mouse, click one of the cells in the data area of the pivot table. although you might expect this to generate a formula such as =e9, excel instead puts in the formula =getpivotdata("sales",b5,"customer","astonishing glass company","region","west"), as shown in figure 12.42.

this function is annoying. as you copy the formula down to more rows, the function keeps retrieving sales to astonishing glass in the west region.

by default, excel generates this function instead of a simple formula such as =e9. this happens whether you use the mouse or the arrow keys to specify the cell in the formula.

to avoid this behavior, you can enter the entire formula by typing it on the keyboard. typing =e9 in a cell forces excel to create a relative reference to cell e9. you are then free to copy the formula to other cells.

there is also a way to turn off this behavior permanently:
1. select a cell inside an active pivot table.
2. the pivot table tools tabs appear. select the options tab. from the pivottable group, select the options drop-down and then select the generate getpivotdata icon (see figure 12.43). the behavior turns off.
3. enter formulas by using the mouse, arrow keys, or keyboard without generating the getpivotdata function.

figure 12.42 excel inserts this strange function in the worksheet.

figure 12.43 you can disable the getpivotdata function option.

microsoft made getpivotdata the default behavior because the function is pretty cool. now that you have learned how to turn off the behavior, you might want to understand exactly how it works in case you ever need to use the function.

syntax: getpivotdata( data_field,pivot_table,field1,item1,field2,item2,... )

the getpivotdata function returns data stored in a pivot table report. you can use getpivotdata to retrieve summary data from a pivot table report, provided that the summary data is visible in the report.
this function takes the following arguments:
• data_field: this is the name, enclosed in quotation marks, for the data field that contains the data you want to retrieve.
• pivot_table: this is a reference to any cell, range of cells, or named range of cells in a pivot table report. this information is used to determine which pivot table report contains the data you want to retrieve.
• field1, item1, field2, item2,...: these are 1 to 14 pairs of field names and item names that describe the data you want to retrieve. the pairs can be in any order. field names and names for items other than dates and numbers are enclosed in quotation marks. for olap pivot table reports, items can contain the source name of the dimension as well as the source name of the item.

calculated fields or items and custom calculations are included in getpivotdata calculations.

if pivot_table is a range that includes two or more pivot table reports, data is retrieved from whichever report was created in the range most recently.

if the field and item arguments describe a single cell, the value of that cell is returned, regardless of whether it is a string, a number, an error, and so on.

if an item contains a date, the value must be expressed as a serial number or populated by using the date function so that the value is retained if the spreadsheet is opened in a different locale. for example, an item referring to the date march 5, 1999, could be entered as 36224 or date(1999,3,5). times can be entered as decimal values or by using the time function.

if pivot_table is not a range in which a pivot table report is found, getpivotdata returns #ref!. if the arguments do not describe a visible field, or if they include a page field that is not displayed, getpivotdata returns #ref!.
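the serial number quoted for march 5, 1999 can be checked with a few lines of python. this is only a sketch: excel_serial and excel_time are hypothetical helper names, the workbook is assumed to use the default 1900 date system, and the december 30, 1899 base date is the usual offset that reproduces excel's serial numbers for dates from march 1900 onward.

```python
from datetime import date

# Base date that lines up with Excel's 1900 date system for dates
# after February 1900 (assumption: the workbook uses that system).
EXCEL_BASE = date(1899, 12, 30)

def excel_serial(d):
    """Serial number Excel stores for a calendar date."""
    return (d - EXCEL_BASE).days

def excel_time(hours, minutes=0, seconds=0):
    """Decimal fraction of a day, the way Excel stores a time."""
    return (hours * 3600 + minutes * 60 + seconds) / 86400

print(excel_serial(date(1999, 3, 5)))  # 36224
print(excel_time(6))                   # 0.25
print(excel_time(18, 0, 0))            # 0.75
```

the first result matches the 36224 given in the text, and the time helper shows why 6:00 a.m. can be entered as the decimal value 0.25.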
examples of database functions

if you were a serious data analyst in the 1980s and the early 1990s, you would have been enamored with the database functions. i personally used @dsum every hour of my work life for many years. it was one of the most powerful weapons in any spreadsheet arsenal. combined with a data table, the dsum, dmin, dmax, and daverage functions got a serious workout when users performed data analysis in a spreadsheet. then, in 1993, microsoft added the pivot table to the data menu in excel. pivot tables changed everything. those powerful database functions seemed tired and worn out. since that day in 1993, i had never used dsum again until i created the example described in the following section. as far as i knew, the database functions had been living in a cave in south carolina.

maybe it is like the nostalgia of finding a box of photos of an old girlfriend, but i realize that the database functions are still pretty powerful. customers whined enough to have microsoft add averageif to the countif and sumif arsenal. this was unnecessary: customers could have done this easily by setting up a small criteria range and using daverage.

eleven of the 12 database functions are similar. dsum, daverage, dcount, dcounta, dmax, dmin, dproduct, dstdev, dstdevp, dvar, and dvarp all perform the same operation as their non-d equivalents, but they use a criteria range so that only the records that meet certain conditions are included.

to save you the hassle of looking up the confusing few, dcount counts numeric cells, and dcounta counts nonblank cells. dstdev and dvar calculate the standard deviation and variance of a sample of a population. dstdevp and dvarp calculate the standard deviation and variance of the entire population.

the 12th database function, dget, has the same arguments, but it acts a bit differently, as explained later in this chapter.
using dsum to conditionally sum records from a database
there are three arguments to every database function. it is very easy to get your first dsum working. the criteria argument is the one that offers vast flexibility. the following section explains the syntax for dsum. the syntax for the other 11 database functions is identical to this.
syntax: dsum(database, field, criteria)
the dsum function adds records from one field in a data set, provided that the records meet some criteria that you specify. the dsum function takes the following arguments:
• database: this is the range of cells that make up the list or database, including the heading row. a database is a list of related data in which rows of related information are records and columns of data are fields. in figure 12.44, the database is the 5,000 rows of data located at a23:i5024.
figure 12.44: a simple criteria range limits dsum to only the records where the customer is best paint inc.
• field: this indicates which column is used in the function. you have three options when specifying a field:
  • you can point to the cell with the field name, such as h23 for revenue.
  • you can include the word revenue as the field argument.
  • you can use the number 8 to indicate that revenue is the eighth field in the database.
• criteria: this is the range of cells that contains the conditions specified. you can use any range for the criteria argument. the criteria range typically includes at least one column label and at least one cell below the column label for specifying a condition for the column. you can also use the computed criteria discussed in “using the miracle version of a criteria range” later in this chapter.
learning how to create powerful criteria ranges allows you to unlock the potential of the database functions. several examples are provided in the following sections.
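the three arguments can be sketched in python as follows. this is a simplified model under stated assumptions, not excel's implementation: the function dsum here is a hypothetical python stand-in, the criteria range is reduced to a single row expressed as a dictionary, text is compared for exact equality, and the tiny data set is invented for illustration.

```python
def dsum(database, field, criteria):
    """Sum `field` over the records that match a one-row criteria range.

    database: list of rows; the first row holds the headings
    field:    a heading name, or a 1-based column number
    criteria: dict mapping heading -> required value; blank entries are ignored
    """
    headings, records = database[0], database[1:]
    col = field - 1 if isinstance(field, int) else headings.index(field)
    total = 0
    for record in records:
        row = dict(zip(headings, record))
        if all(row[h] == v for h, v in criteria.items() if v not in ("", None)):
            total += record[col]
    return total

# hypothetical miniature of the data set described above
database = [
    ["customer", "product", "revenue"],
    ["best paint inc.", "v937", 100],
    ["best paint inc.", "x100", 250],
    ["improved radio traders", "v937", 400],
]

# the field can be given by name or by position
print(dsum(database, "revenue", {"customer": "best paint inc."}))  # 350
print(dsum(database, 3, {"customer": "best paint inc."}))          # 350
```

pointing to the heading cell, as in h23, corresponds to passing the heading text itself, so it is not modeled separately here.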
creating a simple criteria range for database functions
although a criteria range needs only one field heading from the database, it is just as easy to copy the entire set of headings to a blank section of the worksheet. in figure 12.44, for example, the headings in a17:i17, along with at least one additional row, create a criteria range.
in figure 12.44, you see results of the 11 database functions for a simple criterion where the customer is best paint inc. each formula specifies a database of a23:i5024. the field is h23, which is the heading for revenue. the criteria range is a17:i18. in this example, the criteria range could have easily been a17:a18, but the a17:i18 form allows you to enter future criteria without respecifying the criteria range.
using a blank criteria range to return all records
this is a trivial example, but if the second row of the criteria range is completely blank, the database function returns the total of all rows in the data set. as shown in figure 12.45, this is 256 million. this is equivalent to using the sum function.
note: to conserve space, the remaining examples in the following sections show only the dsum result. you can compare the various results to the 657,028 of revenue for the current example.
figure 12.45: if the second row of the criteria is blank, the result reflects all rows.
using and to join criteria
many people using sumif in excel 2003 and earlier are likely to want to know how to conditionally sum based on two conditions. this is simple to do with dsum. if two criteria are placed on the same row of the criteria range, they are joined by an and. in figure 12.46, for example, the 123,275 is the sum of records where the customer is best paint and the product is v937.
figure 12.46: when two criteria are on the same line, they are joined by an and function; rows must meet both criteria to be included in the dsum.
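the behavior of a single criteria row, both blank and with two conditions, can be sketched in python. the helper row_matches and the three records are hypothetical, and exact text equality is assumed for simplicity.

```python
def row_matches(record, criteria_row):
    """True when the record satisfies every non-blank cell of one criteria row."""
    return all(
        record[heading] == wanted
        for heading, wanted in criteria_row.items()
        if wanted != ""  # a blank criteria cell places no restriction
    )

# hypothetical records; the data set in the figures has 5,000 rows
records = [
    {"customer": "best paint inc.", "product": "v937", "revenue": 100},
    {"customer": "best paint inc.", "product": "x100", "revenue": 250},
    {"customer": "improved radio traders", "product": "v937", "revenue": 400},
]

blank_row = {"customer": "", "product": ""}
and_row = {"customer": "best paint inc.", "product": "v937"}

total_all = sum(r["revenue"] for r in records if row_matches(r, blank_row))
total_and = sum(r["revenue"] for r in records if row_matches(r, and_row))
print(total_all)  # 750, the same as a plain sum
print(total_and)  # 100, only the record that meets both conditions
```

because every non-blank cell in the row must match, adding a second value to the same row can only narrow the result.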
using or to join criteria
when two criteria are placed on separate rows of the criteria range, they are joined by an or function. in figure 12.47, the 2.1 million represents records for either improved radio traders or best paint.
figure 12.47: when two criteria are on different rows, they are joined by an or function; rows can meet either criterion to be included in the dsum.
you can use or to join criteria from different fields. the criteria range in figure 12.48 shows a region value of west joined by an or with a district value of texas. this pulls all the west records plus the texas records, which happen to fall in the central region.
figure 12.48: the criteria to be joined with or can be in separate columns.
using dates or numbers as criteria
the example in figure 12.49 finds records with a date in 2015 and with revenue under 50,000. this data set does not contain any records from 2016, so you only need to check for items beyond 2014. the criteria in f18 for the date could have used any of these formats:
• 12/31/2014
• 1/1/2015
• 31-dec-2014
figure 12.49: using dates or numbers in criteria.
using the miracle version of a criteria range
using the criteria ranges in the preceding examples, you could easily build any complex criteria with multiple and or or operators. however, this could get complex. imagine if you wanted to pull all the records for five specific customers and five specific products. you would have to build a criteria range that is 26 rows tall. basically, the first row is the headings for customer and product. the second row indicates that you want to see records for customer1 and product1. the third row indicates that you want to see records for customer1 and product2. the fourth row indicates that you want to see records for customer1 and product3. the seventh row indicates customer2 and product1. the 26th row indicates customer5 and product5.
if you need to pull the records for seven customers and seven products from five districts, your criteria range would grow to 246 rows tall and will probably never finish calculating.
there is a miraculous version of the criteria range that completely avoids this problem. here’s how it works:
• the criteria range consists of a range that is two cells tall and one or more cells wide.
• contrary to instructions in excel help, the top cell of the criteria range cannot contain a field heading. the top cell must be blank or contain anything that does not match the database header row. for example, you could put a heading of “computed criteria.”
• the second row in the criteria range can contain any formula that evaluates to true or false. this formula must point to cells in the first data row of the database. the formula can be as complex as you wish, with and, or, vlookup, not, and match; it can contain any combination of functions.
for a simple example, suppose you want to find records that match 1 of 15 customers. you copy the customers to k24:k38. in the second row of the criteria range, write the formula not(isna(match(a24,k24:k38,0))). this formula does a match on the first customer in the database to see if it is in the list in column k. the isna and not functions make sure that the criteria cell returns true when the customer is 1 of the 15 customers.
very quickly and without complaint, excel compares the 5,000 rows of your database with this complex formula, and the dsum produces the correct value, as shown in figure 12.50.
to watch a video of dsum with this criteria range, search for “excel in depth 12” at youtube.
using the dget function
the dget function returns a single cell from a database. the problem is that this function is picky. if your criteria range matches zero records, dget returns a #value! error. if your criteria range returns more than one row, dget returns a #num! error.
to have dget work, you need to write a criteria record that causes one and only one row to be evaluated as true.
figure 12.50: the formula version of the criteria range is rare but incredibly powerful.
syntax: dget(database, field, criteria)
the dget function returns a single cell matching criteria from a data set. the dget function takes the following arguments:
• database: this is the range of cells that make up the list or database. a database is a list of related data in which rows of related information are records and columns of data are fields. the first row of the list contains labels for each column.
• field: this indicates which column is used in the function. field can be given as text, with the column label enclosed between double quotation marks, such as “age” or “yield”, or as a number that represents the position of the column within the list (for example, 1 for the first column, 2 for the second column, and so on).
• criteria: this is the range of cells that contains the conditions you specify. you can use any range for the criteria argument, as long as it includes at least one column label and at least one cell below the column label for specifying a condition for the column.
excel in practice: using dsum with a data table
if you do not want to use a pivot table, you can do a crosstab analysis by using a combination of the dsum function and the data table command.
the data table command works best when a problem is set up with two variables. in the dsum function, you might have two variables defined in the criteria range.
to set up a two-variable table using the dsum function, follow these steps:
1. ensure that the upper-left corner of the table is a formula that relies on at least two variables. in figure 12.51, cell b1 contains a dsum that relies on the criteria range in a17:i18.
2. down the left side of the table, arrange a list of values that should be substituted for one variable. in this example, the column contains a list of products that will eventually be substituted into cell b18.
3. across the top row of the table, arrange a list of values that should be substituted for the other variable. in this example, the row contains a list of regions that will eventually be substituted into cell c18.
4. select the range for the table. this selection should include the formula as the upper-left corner cell. it should also include the column and row of headings.
5. from the data tab, select what-if analysis, data table. the data table dialog appears, asking for two cells.
6. for the row input cell, enter the cell where the regions should be substituted. in this case, it is cell c18 in the criteria range.
7. for the column input cell, enter the cell where the values down the left column will be substituted. in this case, it is cell b18 in the criteria range. the complete dialog box should look as shown in figure 12.52.
the result is a crosstab analysis that shows the dsum for every combination of product and region. excel actually creates a table array function to produce the answers. this is a live formula: if you change the product names or regions, the cells inside the table recalculate.
figure 12.51: the data table dialog requires two cells.
figure 12.52: the resulting table provides a crosstab analysis similar to that in a pivot table.
13 using financial functions
although the bulk of excel’s financial functions are for professional financiers and investors, a few functions are useful for anyone planning to use a loan to purchase a car or house.
the examples in this chapter represent a small subset of the calculations possible with excel’s financial functions.
the following financial functions use new algorithms in excel 2010:
• cumipmt: cumulative interest paid on a loan
• cumprinc: cumulative principal paid on a loan
• ipmt: interest payment for an investment
• irr: internal rate of return for a series of cash flows
• pmt: payment for a loan
• ppmt: payment on principal for an investment
• xirr: internal rate of return for a schedule of cash flows
the improved algorithms often affect only fringe cases of the functions. you might find that many results are the same as in previous versions of excel. however, if the result in excel 2010 is different, it will always be more accurate than the result from previous versions.
note: some of excel’s financial functions have been updated to have new algorithms in excel 2010. be aware that any worksheets that use these functions might produce different results than the same formula in excel 2007.
table 13.1 provides an alphabetical list of all of excel 2010’s financial functions. detailed examples of the functions are provided in the remainder of the chapter.
table 13.1 alphabetical list of financial functions (function: description)
accrint(issue, first_interest, settlement, rate, par, frequency, basis): returns the accrued interest for a security that pays periodic interest.
accrintm(issue, maturity, rate, par, basis): returns the accrued interest for a security that pays interest at maturity.
amordegrc(cost, date_purchased, first_period, salvage, period, rate, basis): returns the depreciation for each accounting period. this function is provided for the french accounting system. if an asset is purchased in the middle of the accounting period, the prorated depreciation is taken into account. the function is similar to amorlinc, except that a depreciation coefficient is applied in the calculation, depending on the life of the assets.
amorlinc(cost, date_purchased, first_period, salvage, period, rate, basis): returns the depreciation for each accounting period. this function is provided for the french accounting system. if an asset is purchased in the middle of the accounting period, the prorated depreciation is taken into account.
coupdaybs(settlement, maturity, frequency, basis): returns the number of days from the beginning of the coupon period to the settlement date.
coupdays(settlement, maturity, frequency, basis): returns the number of days in the coupon period that contains the settlement date.
coupdaysnc(settlement, maturity, frequency, basis): returns the number of days from the settlement date to the next coupon date.
coupncd(settlement, maturity, frequency, basis): returns a number that represents the next coupon date after the settlement date. to view the number as a date, you select format, cells and then click date in the category box. then click a date format in the type box.
coupnum(settlement, maturity, frequency, basis): returns the number of coupons payable between the settlement date and maturity date, rounded up to the nearest whole coupon.
couppcd(settlement, maturity, frequency, basis): returns a number that represents the previous coupon date before the settlement date. to view the number as a date, you select format, cells and then click date in the category box. then click a date format in the type box.
cumipmt(rate, nper, pv, start_period, end_period, type): returns the cumulative interest paid on a loan between start_period and end_period.
cumprinc(rate, nper, pv, start_period, end_period, type): returns the cumulative principal paid on a loan between start_period and end_period.
db(cost, salvage, life, period, month): returns the depreciation of an asset for a specified period, using the fixed-declining balance method.
ddb(cost, salvage, life, period, factor): returns the depreciation of an asset for a specified period using the double-declining-balance method or some other specified method.
disc(settlement, maturity, pr, redemption, basis): returns the discount rate for a security.
dollarde(fractional_dollar, fraction): converts a dollar price expressed as a fraction into a dollar price expressed as a decimal number. use dollarde to convert fractional dollar numbers, such as securities prices, to decimal numbers.
dollarfr(decimal_dollar, fraction): converts a dollar price expressed as a decimal number into a dollar price expressed as a fraction. use dollarfr to convert decimal numbers to fractional dollar numbers, such as securities prices.
duration(settlement, maturity, coupon, yld, frequency, basis): returns the macaulay duration for an assumed par value of 100. the duration is defined as the weighted average of the present value of the cash flows and is used as a measure of a bond price’s response to changes in yield.
effect(nominal_rate, npery): returns the effective annual interest rate, given the nominal annual interest rate and the number of compounding periods per year.
fv(rate, nper, pmt, pv, type): returns the future value of an investment, based on periodic, constant payments and a constant interest rate.
fvschedule(principal, schedule): returns the future value of an initial principal after applying a series of compound interest rates. use fvschedule to calculate future value of an investment with a variable or adjustable rate.
intrate(settlement, maturity, investment, redemption, basis): returns the interest rate for a fully invested security.
ipmt(rate, per, nper, pv, fv, type): returns the interest payment for a given period for an investment, based on periodic, constant payments and a constant interest rate. for a more complete description of the arguments in ipmt and for more information about annuity functions, see pv.
irr(values, guess): returns the internal rate of return for a series of cash flows represented by the numbers in values. these cash flows do not have to be even, as they would be for an annuity. however, the cash flows must occur at regular intervals, such as monthly or annually. the internal rate of return is the interest rate received for an investment consisting of payments (negative values) and income (positive values) that occur at regular periods.
ispmt(rate, per, nper, pv): calculates the interest paid during a specific period of an investment. this function is provided for compatibility with lotus 1-2-3.
mduration(settlement, maturity, coupon, yld, frequency, basis): returns the modified duration for a security with an assumed par value of 100.
mirr(values, finance_rate, reinvest_rate): returns the modified internal rate of return for a series of periodic cash flows. mirr considers both the cost of the investment and the interest received on reinvestment of cash.
nominal(effect_rate, npery): returns the nominal annual interest rate, given the effective rate and the number of compounding periods per year.
nper(rate, pmt, pv, fv, type): returns the number of periods for an investment, based on periodic, constant payments and a constant interest rate.
npv(rate, value1, value2, ...): calculates the net present value of an investment by using a discount rate and a series of future payments (negative values) and income (positive values).
oddfprice(settlement, maturity, issue, first_coupon, rate, yld, redemption, frequency, basis): returns the price per 100 face value of a security having an odd (short or long) first period.
oddfyield(settlement, maturity, issue, first_coupon, rate, pr, redemption, frequency, basis): returns the yield of a security that has an odd (short or long) first period.
oddlprice(settlement, maturity, last_interest, rate, yld, redemption, frequency, basis): returns the price per 100 face value of a security having an odd (short or long) last coupon period.
oddlyield(settlement, maturity, last_interest, rate, pr, redemption, frequency, basis): returns the yield of a security that has an odd (short or long) last period.
pmt(rate, nper, pv, fv, type): calculates the payment for a loan based on constant payments and a constant interest rate.
ppmt(rate, per, nper, pv, fv, type): returns the payment on the principal for a given period for an investment based on periodic, constant payments and a constant interest rate.
price(settlement, maturity, rate, yld, redemption, frequency, basis): returns the price per 100 face value of a security that pays periodic interest.
pricedisc(settlement, maturity, discount, redemption, basis): returns the price per 100 face value of a discounted security.
pricemat(settlement, maturity, issue, rate, yld, basis): returns the price per 100 face value of a security that pays interest at maturity.
pv(rate, nper, pmt, fv, type): returns the present value of an investment. the present value is the total amount that a series of future payments is worth now. for example, when you borrow money, the loan amount is the present value to the lender.
rate(nper, pmt, pv, fv, type, guess): returns the interest rate per period of an annuity. rate is calculated by iteration and can have zero or more solutions. if the successive results of rate do not converge to within 0.0000001 after 20 iterations, rate returns a #num! error.
received(settlement, maturity, investment, discount, basis): returns the amount received at maturity for a fully invested security.
sln(cost, salvage, life): returns the straight-line depreciation of an asset for one period.
syd(cost, salvage, life, per): returns the sum-of-years’-digits depreciation of an asset for a specified period.
tbilleq(settlement, maturity, discount): returns the bond-equivalent yield for a treasury bill (t-bill).
tbillprice(settlement, maturity, discount): returns the price per 100 face value for a t-bill.
tbillyield(settlement, maturity, pr): returns the yield for a t-bill.
vdb(cost, salvage, life, start_period, end_period, factor, no_switch): returns the depreciation of an asset for any specified period, including partial periods, using the double-declining-balance method or some other specified method. vdb stands for variable declining balance.
xirr(values, dates, guess): returns the internal rate of return for a schedule of cash flows that is not necessarily periodic. to calculate the internal rate of return for a series of periodic cash flows, use the irr function.
xnpv(rate, values, dates): returns the net present value for a schedule of cash flows that is not necessarily periodic. to calculate the net present value for a series of cash flows that is periodic, use the npv function.
yield(settlement, maturity, rate, pr, redemption, frequency, basis): returns the yield on a security that pays periodic interest. you use yield to calculate bond yield.
yielddisc(settlement, maturity, pr, redemption, basis): returns the annual yield for a discounted security.
yieldmat(settlement, maturity, issue, rate, pr, basis): returns the annual yield of a security that pays interest at maturity.
examples of common household loan and investment functions
although excel is popular with banking and investment professionals, it is handy for just about anyone who deals with financial transactions. the first section of this chapter applies to anyone who is planning to buy a car or a house. with a little preplanning with excel, you can build simple worksheets that allow you to calculate various monthly payments for various loan amounts.
you need to keep in mind two universal rules when dealing with all financial functions:
• make sure your time units are consistent. if you calculate a monthly loan payment, the interest rate argument should be expressed as a monthly figure. most interest rates are quoted as an annual figure, such as 5.5%. to convert, divide 5.5% by 12.
• when money changes hands, consider the direction in which money flows. in any transaction, some cash flows toward you (positive), and some cash flows away from you (negative). if you try to enter all terms as positive, you end up with a result that is not meaningful. for example, suppose you want a car loan where the bank gives 20,000 at the beginning and then gives you another 377 per month. nper(5%/12,377,20000) would come up with an incorrect result for your problem because one of the cash flows needs to be negative. if you consider the loan from the point of view of the customer, the formula would be nper(5%/12,-377,20000). if you consider the loan from the point of view of the bank, the formula would be nper(5%/12,377,-20000).
using pmt to calculate the monthly payment on an automobile loan
buying a car is one of the most exciting purchases. whether the car is brand new or just new to you, nothing attracts attention in your neighborhood like a new car pulling into the driveway.
before shopping for a car, you should take a 5-minute spin through excel to calculate potential car payments. knowing the price that will get you to the desired car payment will allow you to haggle with the sales rep from a position of knowledge.
syntax: pmt(rate,nper,pv,fv,type)
the pmt function calculates the payment for a loan based on constant payments and a constant interest rate. this function takes the following arguments:
• rate: this is the interest rate for the loan. note that interest rate is often expressed as an annual rate. if you calculate a monthly payment, you have to divide that rate by 12.
• nper: this is the term, or the total number of payments for the loan.
• pv: this is the present value, or the loan amount; it is also known as the principal.
• fv: this is an optional future value, or a cash balance you want to attain after the last payment is made. for a car payment calculation, this should be 0. if fv is omitted, it is assumed to be 0; that is, the future value of a loan is zero.
• type: this is the number 0 or 1 and indicates when payments are due. the default value of 0 assumes that the first payment is due after a month has elapsed. if you have to make the first payment on the day the loan is issued, you should set this value to 1.
to watch a video of calculating loan payments, search for “excel in depth 13” at youtube.
for a reality check, try multiplying the calculated payment by nper. this way, you can calculate the total of all payments over the life of the loan. in figure 13.1, you see that a 29,000 car actually costs 32,835.95 in principal and interest.
note: the payment returned by pmt includes principal and interest but not taxes, insurance, escrow, or fees sometimes associated with loans.
figure 13.1: pmt calculates a monthly loan payment.
using rate to determine an interest rate
the pmt function is useful when you are considering a new loan.
if you are analyzing a loan that you have been paying for a while, you might know the monthly payment but forget the interest rate. the rate function can help you determine the rate.
caution: the algorithm behind the pmt function is new and more accurate in excel 2010. although this will not affect basic loan payment calculations like the one shown here, be aware that excel 2007 and excel 2010 might produce different results for some uses of pmt.
syntax: rate(nper,pmt,pv,fv,type,guess)
the rate function returns the interest rate per period of an annuity. rate is calculated by iteration and can have zero or more solutions. if the successive results of rate do not converge to within 0.0000001 after 20 iterations, rate returns a #num! error. this function takes the following arguments:
• nper: this is the total number of payment periods in an annuity.
• pmt: this is the payment made each period and cannot change over the life of the annuity. typically, pmt includes principal and interest but no other fees or taxes. if pmt is omitted, you must include the fv argument.
• pv: this is the present value, the total amount that a series of future payments is worth now.
• fv: this is the future value, or a cash balance you want to attain after the last payment is made. if fv is omitted, it is assumed to be 0, which means the future value of a loan is zero.
• type: this is the number 0 or 1 to indicate when payments are due. the default value of 0 assumes that payments are due at the end of the period. a value of 1 means the payments are due at the beginning of each period.
• guess: this is your guess for what the rate will be. if you omit guess, the rate is assumed to be 10%. if rate does not converge, you can try different values for guess. rate usually converges if guess is between 0 and 1.
make sure you are consistent about the units you use for specifying guess and nper.
if you make monthly payments on a 4-year loan at 12% annual interest, you use 12%/12 for guess and 4*12 for nper. if you make annual payments on the same loan, you use 12% for guess and 4 for nper.
figure 13.2 shows how to calculate an interest rate.
figure 13.2: given the other terms for a loan, back into the interest rate with rate.
using pv to figure out how much house you can afford
if you are looking for a monthly house payment of 1,500 with a 15-year loan at 6% annual interest rate, you can back into the loan amount by using the pv function.
syntax: pv(rate,nper,pmt,fv,type)
the pv function returns the present value of an investment. the present value is the total amount that a series of future payments is worth now. for example, when you borrow money, the loan amount is the present value to the lender. this function takes the following arguments:
• rate: this is the interest rate per period. for example, if you obtain an automobile loan at a 10% annual interest rate and make monthly payments, your interest rate per month is 10%/12, or 0.008333. therefore, you would enter 10%/12, or 0.8333%, or 0.00833, into the formula as rate.
• nper: this is the total number of payment periods in an annuity. for example, if you get a 4-year car loan and make monthly payments, your loan has 4*12 (or 48) periods. you would enter 48 into the formula for nper.
• pmt: this is the payment made each period and cannot change over the life of the annuity. typically, pmt includes principal and interest but no other fees or taxes. for example, the monthly payments on a 10,000, 4-year car loan at 12% are 263.33. you would enter -263.33 into the formula for pmt. if pmt is omitted, you must include the fv argument.
• fv: this is the future value, or a cash balance you want to attain after the last payment is made. if fv is omitted, it is assumed to be 0, which means the future value of a loan is zero.
for example, if you want to save 50,000 to pay for a special project in 18 years, then 50,000 is the future value. you could then make a conservative guess at an interest rate and determine how much you must save each month. if fv is omitted, you must include the pmt argument.
• type: this is the number 0 or 1 to indicate when payments are due. the default value of 0 assumes that payments are due at the end of the period. a value of 1 means the payments are due at the beginning of each period.
in figure 13.3, cell b5 calculates the loan principal amount that would result in the desired payment, including principal and interest. you also need to budget for monthly insurance, taxes, and fees that might be a part of your monthly payment to the bank.
figure 13.3: use pv to calculate how much you can borrow to meet a monthly payment budget.
using nper to estimate how long a nest egg will last
nper stands for number of periods. if you have a 401k retirement account and are trying to calculate how long you can withdraw fixed monthly payments from the account, use nper.
syntax: nper(rate,pmt,pv,fv,type)
the nper function returns the number of periods for an investment, based on periodic, constant payments and a constant interest rate. this function takes the following arguments:
• rate: this is the interest rate per period.
• pmt: this is the payment made each period; it cannot change over the life of the annuity. typically, pmt contains principal and interest but no other fees or taxes.
• pv: this is the present value, or the lump-sum amount that a series of future payments is worth right now.
• fv: this is the future value, or a cash balance you want to attain after the last payment is made. if fv is omitted, it is assumed to be 0, which means the future value of a loan is zero. if you want to leave an inheritance to your kids, you use that amount as the fv.
• type: this is the number 0 or 1 to indicate when payments are due.
the default value of 0 assumes that payments are due at the end of the period. a value of 1 means the payments are due at the beginning of each period.

in figure 13.4, the nper function in cell b5 estimates how many months you can withdraw the amount in cell b2. note that the monthly withdrawal is negative from the point of view of the retirement account.

figure 13.4 use nper to figure out how long an annuity can pay out before it ends in a zero balance.

using fv to estimate the future value of a regular savings plan

the future value calculation assumes that you will make regular payments to a savings plan every month. it also assumes that the interest rate does not change throughout the life of the savings plan. if you are young, it is likely that you can save more as your income grows later. however, using the savings calculator in figure 13.5 helps you to realize the value of regular savings.

syntax: fv(rate,nper,pmt,pv,type)

the fv function returns the future value of an investment, based on periodic, constant payments and a constant interest rate. this function takes the following arguments:

• rate—this is the interest rate per period.
• nper—this is the total number of payment periods in an annuity.
• pmt—this is the payment made each period; it cannot change over the life of the annuity. typically, pmt contains principal and interest but no other fees or taxes. if pmt is omitted, you must include the pv argument.
• pv—this is the present value, or the lump-sum amount that a series of future payments is worth right now. if pv is omitted, it is assumed to be 0, and you must include the pmt argument.
• type—this is the number 0 or 1 to indicate when payments are due. the default value of 0 assumes that payments are due at the end of the period. a value of 1 means the payments are due at the beginning of each period.
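as a rough cross-check on the fv arguments listed above, the following python sketch applies the standard annuity formula that fv is generally understood to follow. the function name future_value, the parameter name when (standing in for type), and the sample inputs are illustrative assumptions; they are not the values from figure 13.5.

```python
def future_value(rate, nper, pmt, pv=0.0, when=0):
    """Future value of a savings plan, using spreadsheet-style signs.

    Cash paid out (deposits) is negative; cash received is positive.
    when: 0 for end-of-period payments, 1 for beginning-of-period payments.
    """
    if rate == 0:
        # With no interest, the balance is simply everything paid in.
        return -(pv + pmt * nper)
    growth = (1 + rate) ** nper
    return -(pv * growth + pmt * (1 + rate * when) * (growth - 1) / rate)


# Hypothetical inputs: deposit 100 at the end of each month for 30 years
# at a 6% annual rate, starting from an empty account.
balance = future_value(0.06 / 12, 30 * 12, -100)
print(round(balance, 2))
```

the deposit is passed as a negative number because it is cash paid out, so the resulting balance comes back as a positive number.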
for all the arguments, the cash you pay out, such as deposits to savings, is represented by negative numbers; the cash you receive, such as dividend checks, is represented by positive numbers.

figure 13.5 shows how to use fv for a simple savings calculator. the formula in cell b8 assumes that you continue making the deposit each month from cell b4 until you retire and that interest rates remain constant. if you already have some amount in savings, you enter that in cell b6.

figure 13.5 you can estimate the future value of a regular savings plan.

note: the fv formula uses a negative version of cells b4 and b6. this occurs because these are amounts that leave your wallet and go to the bank or mutual fund.

examples of functions for financial professionals

whereas a typical consumer is interested in the amount of his or her monthly car payment, a loan maker is interested in the month-by-month breakdown of principal and interest. excel offers a complete cadre of functions to do these calculations.

using ppmt to calculate the principal payment for any month

after a bank writes a car loan, the consumer makes monthly payments. to calculate the principal portion of the payment for any period in the loan, you use ppmt. of course, you can use a range of these formulas, one for each month, to build an amortization table.

syntax: ppmt(rate,per,nper,pv,fv,type)

the ppmt function returns the payment on the principal for a given period for an investment, based on periodic, constant payments and a constant interest rate. this function takes the following arguments:

• rate—this is the interest rate per period.
• per—this specifies for which period the principal payment will be returned. it must be in the range 1 to nper.
• nper—this is the total number of payment periods in an annuity.
• pv—this is the present value—the total amount that a series of future payments is worth now.
• fv—this is the future value, or a cash balance you want to attain after the last payment is made. if fv is omitted, it is assumed to be 0, which means the future value of a loan is zero.
• type—this is the number 0 or 1 to indicate when payments are due. the default value of 0 assumes that payments are due at the end of the period. a value of 1 means the payments are due at the beginning of each period.

in figure 13.6, cell b9 calculates the principal payment for period 1. the per argument comes from the month number in column a. copying the formula down for all months produces an amortization table.

note: in this example, the interest component could be calculated either as pmt - ppmt or with the ipmt function. ipmt is discussed in the next section.

tip: to generate the column of numbers starting in a9, enter the formula =row(1:1) in cell a9. when you copy this formula down, the 1:1 reference will change to 2:2, 3:3, and so on. this is a fast way to generate a column of sequential numbers using a single formula. alternatively, use =row(a1) for the same result.

using ipmt to calculate the interest portion of a loan payment for any month

whereas the ppmt function calculates the principal payment for any month of a loan, the ipmt function calculates the interest portion of the payment. the results of ipmt are shown in column c of figure 13.6.

syntax: ipmt(rate,per,nper,pv,fv,type)

the ipmt function returns the interest payment for a given period for an investment, based on periodic, constant payments and a constant interest rate. this function takes the following arguments:

• rate—this is the interest rate per period.
• per—this is the period for which you want to find the interest and must be in the range 1 to nper.
• nper—this is the total number of payment periods in an annuity.
• pv—this is the present value, or the lump-sum amount that a series of future payments is worth right now.
• fv—this is the future value, or a cash balance you want to attain after the last payment is made.
• type—this is the number 0 or 1 to indicate when payments are due. the default value of 0 assumes that payments are due at the end of the period. a value of 1 means the payments are due at the beginning of each period.

the ipmt function is similar to the ppmt function. combined, they can create a simple amortization table (refer to figure 13.6).

figure 13.6 similar ppmt functions in b9:b56 calculate the monthly principal portion of the loan payment.

using cumipmt to calculate total interest payments during a time frame

the cumipmt function is great for figuring out your yearly tax deduction for your mortgage interest. after specifying the typical components of a loan such as the rate, term, and amount, you need to specify that you want to calculate the interest for particular periods, such as periods 6 through 18.

syntax: cumipmt(rate,nper,pv,start_period,end_period,type)

the cumipmt function returns the cumulative interest paid on a loan between start_period and end_period. this function takes the following arguments:

• rate—this is the interest rate.
• nper—this is the total number of payment periods.
• pv—this is the present value.
• start_period—this is the first period in the calculation. payment periods are numbered beginning with 1.
• end_period—this is the last period in the calculation.
• type—this is the number 0 or 1 to indicate when payments are due. the default value of 0 assumes that payments are due at the end of the period. a value of 1 means the payments are due at the beginning of each period.

the nper, start_period, end_period, and type arguments are truncated to integers. if rate is less than or equal to 0, nper is less than or equal to 0, or pv is less than or equal to 0, cumipmt returns a #num! error.
if start_period is less than 1, end_period is less than 1, or start_period is greater than end_period, cumipmt returns a #num! error. if type is any number other than 0 or 1, cumipmt returns a #num! error.

figure 13.7 calculates the total interest paid during each year of the loan. the mildly difficult portion of the sample spreadsheet is that the number of months in the first year will likely be less than 12. cell d12 uses =13-month(b5). cell c13 uses =d12+1. cell d13 uses =c13+11 to calculate the last period for each year. column f of this spreadsheet uses cumprinc, which is discussed in the next section.

figure 13.7 use column e to plan your tax deductions by year.

caution: the algorithm behind the cumipmt function is new and more accurate in excel 2010. be aware that excel 2007 and excel 2010 might produce different results for some uses of cumipmt.

caution: the algorithm behind the ipmt function is new and more accurate in excel 2010. be aware that excel 2007 and excel 2010 might produce different results for some uses of ipmt.

note: you may encounter an old worksheet that uses ispmt, which is the lotus 1-2-3 version of ipmt. for details on ispmt, see excel help. for new worksheets, you should use ipmt instead of ispmt.

using cumprinc to calculate total principal paid in any range of periods

the counterpart to cumipmt is a function to calculate the total principal paid during any range of periods of a loan: cumprinc.

syntax: cumprinc(rate,nper,pv,start_period,end_period,type)

the cumprinc function returns the cumulative principal paid on a loan between start_period and end_period. this function takes the following arguments:

• rate—this is the interest rate.
• nper—this is the total number of payment periods.
• pv—this is the present value.
• start_period—this is the first period in the calculation. payment periods are numbered beginning with 1.
• end_period—this is the last period in the calculation.
• type—this is the number 0 or 1 to indicate when payments are due. the default value of 0 assumes that payments are due at the end of the period. a value of 1 means the payments are due at the beginning of each period.

the nper, start_period, end_period, and type arguments are truncated to integers. if rate is less than or equal to 0, nper is less than or equal to 0, or pv is less than or equal to 0, cumprinc returns a #num! error. if start_period is less than 1, end_period is less than 1, or start_period is greater than end_period, cumprinc returns a #num! error. if type is any number other than 0 or 1, cumprinc returns a #num! error.

figure 13.7 shows an example of cumprinc.

using effect to calculate the effect of compounding period on interest rates

does it really matter if your bank compounds interest daily, monthly, or quarterly? if the numbers are big enough, it can matter. the effect function converts an interest rate to an effective rate, depending on how frequently the bank compounds the interest.

syntax: effect(nominal_rate,npery)

the effect function returns the effective annual interest rate, given the nominal annual interest rate and the number of compounding periods per year. this function takes the following arguments:

• nominal_rate—this is the nominal interest rate.
• npery—this is the number of compounding periods per year.

npery is truncated to an integer. if either argument is nonnumeric, effect returns a #value! error. if nominal_rate is less than or equal to 0 or if npery is less than 1, effect returns a #num! error.

in figure 13.8, the nominal interest rate is 6%. if the bank compounds interest once per year, the effective interest rate is still 6%, as shown in cell a5. if interest is compounded monthly, the effective rate increases to 6.17%. row 9 compares the monthly mortgage payment at the various effective rates.
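the relationship between the nominal rate and the effective rate can be sketched in python as follows. this assumes the usual compound-interest formula, (1 + nominal_rate / npery) ^ npery - 1. the function name effective_rate and the list of compounding frequencies are illustrative; they are not taken from figure 13.8.

```python
def effective_rate(nominal_rate, npery):
    """Effective annual rate for a nominal annual rate compounded npery times a year."""
    npery = int(npery)  # mirror the truncation to an integer described above
    if nominal_rate <= 0 or npery < 1:
        # The worksheet function reports #NUM! here; this sketch raises instead.
        raise ValueError("nominal_rate must be positive and npery at least 1")
    return (1 + nominal_rate / npery) ** npery - 1


# Hypothetical comparison of compounding frequencies at a 6% nominal rate.
for label, periods in [("annual", 1), ("quarterly", 4), ("monthly", 12), ("daily", 365)]:
    print(label, f"{effective_rate(0.06, periods):.4%}")
```

with a 6% nominal rate, the sketch gives 6% for annual compounding and about 6.17% for monthly compounding, in line with the values quoted above.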
daily compounding adds about 23 per month to a typical mortgage payment.

figure 13.8 row 5 shows the effective interest rates for various compounding periods. row 9 shows the monthly payment difference.

caution: the algorithm behind the cumprinc function is new and more accurate in excel 2010. be aware that excel 2007 and excel 2010 might produce different results for some uses of cumprinc.

using nominal to convert the effective interest rate to a nominal rate

if you need to compare two investments, one quoting a nominal rate and one quoting an effective rate, you can convert the effective rate to a nominal rate by using nominal.

syntax: nominal(effect_rate,npery)

the nominal function returns the nominal annual interest rate, given the effective rate and the number of compounding periods per year. this function takes the following arguments:

• effect_rate—this is the effective interest rate.
• npery—this is the number of compounding periods per year.

the npery argument is truncated to an integer. if either argument is nonnumeric, nominal returns a #value! error. if effect_rate is less than or equal to 0 or if npery is less than 1, nominal returns a #num! error.

examples of depreciation functions

when a company buys a large asset such as a piece of machinery, accounting rules specify how the asset should be expensed each year. this is called depreciation. excel offers four common methods for calculating depreciation: straight-line, declining-balance, double-declining-balance, and sum-of-years’-digits methods.

the following terms are common to all the depreciation methods:

• cost—this is the initial cost of the asset. for example, the machinery might cost 120,000.
• useful life—this is how long you expect to use the asset. if you think the machinery will be used for 10 years before being replaced, the life is 10 years.
• salvage value—this is the value of the asset at the end of the useful life. perhaps after 10 years, you can sell the machine to a scrap dealer for 1,000 or to a trade school for 5,000. this is the salvage value.

figure 13.9 compares the four depreciation methods.

figure 13.9 columns b through e compare four methods of depreciation.

using sln to calculate straight-line depreciation

the straight-line method is the simplest depreciation method. using this method, the value of the asset is depreciated evenly over the asset’s useful life. at the end of the useful life, the item is depreciated on the company’s books to the salvage value level.

syntax: sln(cost,salvage,life)

the sln function returns the straight-line depreciation of an asset for one period. this function takes the following arguments:

• cost—this is the initial cost of the asset.
• salvage—this is the asset’s value at the end of the depreciation period. sometimes this is called the salvage value of the asset.
• life—this is the number of periods over which the asset is being depreciated. sometimes this is called the useful life of the asset.

using db to calculate declining-balance depreciation

in the declining-balance method, depreciation happens at a constant rate. the advantage of this method is that more depreciation happens in the earlier years, providing a better tax benefit in early years.

let’s look at a simple example. suppose that a 100,000 asset is depreciated 20% in year 1. this results in a 20,000 depreciation expense. after year 1, the asset would have a value of 80,000 on the books. in year 2, the remaining balance of 80,000 is multiplied by the same 20% rate to yield a depreciation of 16,000. the depreciation in year 3 is 20% of the remaining 64,000, or 12,800.

the trick to this method is figuring out the correct percentage to use for each year. this involves fractional exponents and a little algebra.
if you use the db function, however, you do not have to worry about any of that. excel calculates this rate, rounded to three decimal places, as the first step in the process. this rounding to three decimal places causes the calculation to be off by a few dollars at the end of the useful life.

syntax: db(cost,salvage,life,period,month)

the db function returns the depreciation of an asset for a specified period, using the fixed-declining-balance method. this function takes the following arguments:

• cost—this is the initial cost of the asset.
• salvage—this is the value at the end of the depreciation period. sometimes this is called the salvage value of the asset.
• life—this is the number of periods over which the asset is being depreciated. sometimes this is called the useful life of the asset.
• period—this is the period for which you want to calculate the depreciation. period must use the same units as life.
• month—this is the number of months in the first year. if month is omitted, it is assumed to be 12.

note: see excel help for this function for details on special handling of year 1 and the last year, as well as the algebra behind the rate formula.

using ddb to calculate double-declining-balance depreciation

the double-declining-balance method is an aggressive (and legal) method for calculating depreciation. suppose you purchased a computer. in the first year, the item might be state-of-the-art. by year 2, it is worth far less because technology would have passed the computer by.

the name of this method reflects two facts: the depreciation rate is double the normal rate, and the rate is applied to the declining balance of the asset’s value. if the asset is depreciated over 5 years, the normal straight-line rate would be 20%. in the double-declining-balance method, you get to use 40% in each year. for example, in the first year, depreciation on a 100,000 asset would be 40%, or 40,000.
but in year 2, the 40% is multiplied by the remaining asset value of 60,000. this method generates much higher depreciation in the first few years of the asset life than the other methods.

although the name of this method contains the word double, microsoft covered the possibility of other multipliers. there is a 150db method that multiplies the rate by 1.5 instead of 2. to calculate 150db, you use 1.5 as the fifth argument. if no fifth argument is supplied, the fifth argument is assumed to be 2, resulting in ddb.

syntax: ddb(cost,salvage,life,period,factor)

the ddb function returns the depreciation of an asset for a specified period using the double-declining-balance method or some other specified method. this function takes the following arguments:

• cost—this is the initial cost of the asset.
• salvage—this is the value at the end of the depreciation period.
• life—this is the number of periods over which the asset is being depreciated.
• period—this is the period for which you want to calculate the depreciation. the period must use the same units as life.
• factor—this is the rate at which the balance declines. if factor is omitted, it is assumed to be 2, which is the double-declining-balance method.

to allow ddb to work, you need to abandon the method at some point and switch to a straight-line method for the remaining asset value. if you attempt to use ddb for the entire life of the asset, you will not write off enough of the value. figure 13.10 illustrates how ddb fails to accumulate 500,000 of depreciation. you might want to use the newer vdb method, which automatically switches for you. column d in figure 13.10 shows this method.

note: in many depreciation systems, you are allowed to switch from double-declining-balance to the straight-line method when the straight-line method produces a higher depreciation. to do this, use the vdb function, which is described later in this chapter.
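to see why ddb on its own leaves part of the asset unwritten, the following python sketch applies the declining-balance rule year by year with no switch to straight-line. the function name ddb_schedule and the sample asset (cost 100,000, no salvage value, 5-year life) are assumptions chosen to match the 40% example above; they are not the values in figure 13.10.

```python
def ddb_schedule(cost, salvage, life, factor=2.0):
    """Depreciation for each period under the declining-balance rule.

    Each period takes factor / life of the remaining book value,
    but never pushes the book value below the salvage value.
    """
    schedule = []
    accumulated = 0.0
    for _ in range(life):
        book_value = cost - accumulated
        amount = min(book_value * factor / life, book_value - salvage)
        amount = max(amount, 0.0)
        schedule.append(amount)
        accumulated += amount
    return schedule


# Hypothetical asset: cost 100,000, no salvage value, 5-year life.
schedule = ddb_schedule(100_000, 0, 5)
shortfall = 100_000 - sum(schedule)
print(schedule)
print(shortfall)
```

for this hypothetical asset, the five yearly amounts are 40,000, 24,000, 14,400, 8,640, and 5,184. they total 92,224, which leaves 7,776 of the cost undepreciated at the end of the asset's life.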
tip: keep in mind that all five of the arguments listed previously must be positive numbers.

to overcome this problem with ddb, you can use the vdb method. the vdb function is a far more powerful function. using vdb to calculate a double-declining-balance problem correctly is somewhat like using a sledgehammer to push in a thumbtack. vdb is covered in detail later in this chapter, but you can follow these steps to solve the current problem:

1. change the function name from ddb to vdb.
2. because both ddb and vdb take the same first three arguments (cost, salvage, and life), leave those three arguments alone.
3. change period number in ddb to start period and end period for vdb. cell c6 specifies a6 as the period number. change this argument to a6-1,a6. this is a bit strange because you are asking vdb to calculate the depreciation from the end of year 0 to the end of year 1.
4. check whether the ddb function supplies a factor. factor, which is usually left off the function, is assumed to be 2. if ddb has no fifth argument, then vdb does not need a factor argument either. in vdb, factor is the sixth argument.
5. to allow vdb to switch to the straight-line method, ensure that the seventh argument is false. the name of this argument is no_switch. by specifying false, you are invoking a double negative to ask vdb to switch to the straight-line method when appropriate.

because false is the default, you can often leave off the sixth and seventh arguments with vdb. complete details on the more powerful uses of vdb are provided later in this chapter.

figure 13.10 the ddb method fails to accumulate enough depreciation.

using syd to calculate sum-of-years’-digits depreciation

the sum-of-years’-digits method is another accelerated depreciation system. it ensures that the value of the asset drops more in the earlier years of the asset’s life than in later years. suppose you have an asset with a useful life of seven years.
you need to add all the years from seven to one: 7 + 6 + 5 + 4 + 3 + 2 + 1 = 28. in the first year, you can write off 7 / 28 of the value. in the next year, you can write off 6 / 28. in successive years, you can write off 5 / 28, 4 / 28, 3 / 28, 2 / 28, and 1 / 28 of the depreciable value.

syntax: syd(cost,salvage,life,per)

the syd function returns the sum-of-years’-digits depreciation of an asset for a specified period. this function takes the following arguments:

• cost—this is the initial cost of the asset.
• salvage—this is the value at the end of the depreciation period. sometimes this is called the salvage value of the asset.
• life—this is the number of periods over which the asset is being depreciated. sometimes this is called the useful life of the asset.
• per—this is the period and must use the same units as life.

using vdb to calculate depreciation for any period

as mentioned in the discussion of the ddb function, the vdb function is newer and far more powerful than the other depreciation functions. it is interesting for tax purposes to know the annual depreciation amounts. however, if you work for a public company, you have to report depreciation at least quarterly. figure 13.11 shows an example that calculates the exact depreciation to be booked each quarter.

syntax: vdb(cost,salvage,life,start_period,end_period,factor,no_switch)

the vdb function returns the depreciation of an asset for any specified period, including partial periods, using the double-declining-balance method or some other specified method. vdb stands for variable declining balance. the vdb function takes the following arguments:

• cost—this is the initial cost of the asset.
• salvage—this is the value at the end of the depreciation period.
• life—this is the number of periods over which the asset is being depreciated.
to calculate depreciation for periods smaller than a year, multiply the number of years by 12, or even 365.
• start_period—this is the starting period for which you want to calculate the depreciation. start_period must use the same units as life.
• end_period—this is the ending period for which you want to calculate the depreciation. end_period must use the same units as life.
• factor—this is the rate at which the balance declines. if factor is omitted, it is assumed to be 2, which is the double-declining-balance method. you change factor if you do not want to use the double-declining-balance method.
• no_switch—this is a logical value that specifies whether to switch to straight-line depreciation when straight-line depreciation is greater than the declining-balance calculation. if this is false or omitted, excel switches to the straight-line method when it becomes more beneficial to do so. if this value is true, excel holds on to the ddb method until the end of life.

tip: keep in mind that all the arguments listed previously, except no_switch, must be positive numbers.

to set up a schedule that shows depreciation for each quarter, you follow these steps:

1. enter the cost, salvage value, and useful life at the top of the worksheet.
2. enter the date on which the equipment is placed in service in cell b4.
3. enter dates for the first quarter in cells a7 and b7. the value in cell a7 is the date the unit is placed in service. manually figure out the last date of the quarter for cell b7.
4. ensure that the formula for columns a and b in each subsequent row is the same. in cell a8, enter =b7+1. in cell b8, enter =eomonth(a8,2). the eomonth function reports the end of the month that falls two months after what is shown in cell a8. copy these formulas down as far as necessary.
5. to build the vdb function, use the normal values for cost and salvage value. instead of 7 for life, use 7*365 to have the function calculate a daily depreciation rate.
6. for start_period, use the date in column a minus the date in service.
7. for end_period, use the date in column b minus the date in service.
8. if you are using the double-declining-balance method, omit the sixth and seventh arguments (factor and no_switch).
9. copy the vdb function down to all your rows.

the table shown in figure 13.11 shows the depreciation to be booked each quarter for this particular piece of machinery.

figure 13.11 vdb allows you to calculate depreciation for each month or quarter.

note: if you happen to work for a french-owned company, keep in mind special considerations when calculating depreciation. read the help topics for amordegrc and amorlinc to understand these methods that have been added to excel to accommodate the french accounting rules.

functions for investment analysis

the invention of the computer spreadsheet in 1979 enabled the rapid growth of the mergers and acquisitions business in the 1980s. business plans can be modeled in excel, with the resulting series of net income values discounted to determine the current value of a business. excel offers a wide array of functions that can be used to analyze a business investment.

using the npv function to determine net present value

suppose that you have a pile of cash. you have the opportunity to invest that cash in a long-term cd that earns 2% interest. you also have the opportunity to use that cash to buy a business. the 2% is called the hurdle rate. if the business cannot return more than the 2% hurdle rate, you should probably look for another business.

you have analyzed the business plan and projected that the business will generate a certain series of net income over each of the next five years. you can analyze the net present value of the investment by using the npv function.

syntax: npv(rate,value1,value2,...)

the npv function calculates the net present value of an investment by using a discount rate and a series of future payments (negative values) and income (positive values). this function takes the following arguments:

• rate—this is the rate of discount over the length of one period.
• value1,value2,...—these are 1 to 254 arguments representing the payments and income. alternatively, you can refer to a range of values. value1, value2,... must be equally spaced in time and occur at the end of each period. the function uses the order of value1, value2,... to interpret the order of cash flows. you need to be sure to enter your payment and income values in the correct sequence.

the npv investment begins one period before the date of the value1 cash flow and ends with the last cash flow in the list. arguments of value1, value2,... are cash flows at the end of year 1, year 2, and so on. in this example, if you buy a business for 50,000, this amount should not be entered as a value in the function. instead, you should subtract the 50,000 from the result of npv.

npv is similar to the pv function. the primary difference between pv and npv is that pv allows cash flows to begin either at the end or at the beginning of the period. unlike the variable npv cash flow values, pv cash flows must be constant throughout the investment.

npv is also related to the irr function. irr is the rate for which npv equals zero: npv(irr(...), ...) = 0.

in figure 13.12, the business will cost 50,000. the business will lose 5,000 in year 1 and then generate 61,000 over the next 4 years. based on these cash flows, npv is positive, which means that the investment will do better than a cd at 2% interest.

figure 13.12 npv can analyze a periodic series of cash flows.

note: npv requires the cash flows to occur at a regular rate. if you instead have a series of projected cash flows on varying dates, you should use xnpv instead.
see the section “using xnpv to calculate the net present value when the payments are not periodic,” later in this chapter.

using irr to calculate the return of a series of cash flows

in the previous section, you used the npv function to determine whether a business investment met or did not meet a certain desired rate of return. in figure 13.12, npv is positive, indicating that the business was able to beat a 2% return after five years. if you want to figure out the internal rate of return, use the irr function.

one critical difference exists between irr and npv: in the npv function, the initial investment in the business is not included in the list of arguments. in the irr function, the initial investment in the business needs to be included as the first cash flow. because this is money paid for the business, it should be negative.

syntax: irr(values,guess)

the irr function returns the internal rate of return for a series of cash flows, represented by the numbers in values. these cash flows do not have to be even, as they would be for an annuity. however, the cash flows must occur at regular intervals, such as monthly or annually. the internal rate of return is the interest rate received for an investment, consisting of payments (negative values) and income (positive values) that occur at regular periods. the irr function takes the following arguments:

• values—this is an array or a reference to cells that contain numbers for which you want to calculate the internal rate of return. values must contain at least one positive value and one negative value to calculate the internal rate of return. irr uses the order of values to interpret the order of cash flows. you need to be sure to enter your payment and income values in the sequence you want. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored.
• guess—this is a number that you guess is close to the result of irr.
microsoft excel uses an iterative technique for calculating irr. starting with guess, irr cycles through the calculation until the result is accurate within 0.00001%. if irr cannot find a result that works after 20 tries, a #num! error is returned. in most cases, you do not need to provide guess for the irr calculation. if guess is omitted, it is assumed to be 0.1, which is 10%. if irr gives a #num! error, or if the result is not close to what you expected, you can try again with a different value for guess.

irr is closely related to npv, the net present value function. the rate of return calculated by irr is the interest rate corresponding to a net present value of zero. the following formula demonstrates how npv and irr are related: enter =npv(irr(b1:b6),b1:b6) in a cell, which equals 3.60e-08. within the accuracy of the irr calculation, the value 3.60e-08 is effectively zero.

in figure 13.12, the formula in cell b17 shows that the business investment would generate a rate of return of 2.7% if analyzed over a 5-year period. the arguments for this function include the initial 50,000 investment in the business as well as the net incomes from the next 5 years.

similar formulas in cells b14 and b15 initially returned a #num! error, so the formulas were edited to add a guess value. based on the -12% return through 4 years, the guess used for three years was -10%.

note: irr fails to take into account that the money earned in year 1 could start generating interest if invested in a cd. to calculate a rate of return including the reinvestment of profits, use mirr, which is described in the following section.

caution: the algorithm behind the irr function is new and more accurate in excel 2010. be aware that excel 2007 and excel 2010 might produce different results for some uses of irr.

using mirr to calculate internal rate of return, including interest rates

mirr calculates a modified internal rate of return.
this function assumes that cash flows from the business are reinvested at some interest rate. it also offers an argument to specify the initial interest rate of the business loan used to purchase the business.

syntax: mirr(values,finance_rate,reinvest_rate)

the mirr function returns the modified internal rate of return for a series of periodic cash flows. mirr considers both the cost of the investment and the interest received on reinvestment of cash.

this function takes the following arguments:
• values: this is an array or a reference to cells that contain numbers. these numbers represent a series of payments (negative values) and income (positive values) occurring at regular periods. values must contain at least one positive value and one negative value to calculate the modified internal rate of return. otherwise, mirr returns a #div/0! error. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells with the value 0 are included.
• finance_rate: this is the interest rate you pay on the money used in the cash flows.
• reinvest_rate: this is the interest rate you receive on the cash flows as you reinvest them.

mirr uses the order of values to interpret the order of cash flows. you need to be sure to enter your payment and income values in the sequence you want and with the correct signs. in other words, enter positive values for cash received and negative values for cash paid.

in figure 13.13, you are analyzing a business that was started 5 years ago with a 120,000 loan. the business has generated profits of 17,000, 34,000, 38,000, 5,000, and 32,000. the original loan had an interest rate of 5%, and the profits were reinvested at 2.25%. the mirr in cell b10 is 1.9%. for comparison, the irr of the same cash flows would be only 1.64%.

using xnpv to calculate the net present value when the payments are not periodic

the previous examples assume that everything happens on the last day of each year.
in reality, the business purchase date and the business sales date might occur on other days. in such a case, you use xnpv.

syntax: xnpv(rate,values,dates)

the xnpv function returns the net present value for a schedule of cash flows that is not necessarily periodic. to calculate the net present value for a series of cash flows that is periodic, you use the npv function.

the xnpv function takes the following arguments:
• rate: this is the discount rate to apply to the cash flows.
• values: this is a series of cash flows that corresponds to a schedule of payments in dates. the first payment is optional and corresponds to a cost or payment that occurs at the beginning of the investment. if the first value is a cost or payment, it must be a negative value. all succeeding payments are discounted based on a 365-day year. the series of values must contain at least one positive value and one negative value.
• dates: this is a schedule of payment dates that corresponds to the cash flow payments. the first payment date indicates the beginning of the schedule of payments. all other dates must be later than this date, but they may occur in any order.

only dates are considered; any times appended to the dates are truncated. if any argument is nonnumeric, xnpv returns a #value! error. if any number in dates is not a valid date, xnpv returns a #num! error. if any number in dates precedes the starting date, xnpv returns a #num! error. if values and dates contain different numbers of values, xnpv returns a #num! error.

in figure 13.14, the company was purchased on march 15, 2001. the company posted no net profit in 2002. the company was sold in february 2006. the xnpv function in row 9 shows that this deal clearly beat the 4% hurdle rate.
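the discounting that xnpv performs can be sketched in a few lines of python. this is a minimal illustration, not excel’s implementation: the function name, the error handling, and the sample cash flows are assumptions made for the example. each cash flow is discounted by the number of days since the first date, divided by 365.

```python
from datetime import date


def xnpv(rate, values, dates):
    """Net present value of dated cash flows, discounted on a 365-day year.

    Hypothetical helper for illustration; it mirrors the rules described
    in the text but is not Excel's own code.
    """
    if len(values) != len(dates):
        raise ValueError("values and dates must have the same length")
    start = dates[0]
    if any(d < start for d in dates):
        raise ValueError("no date may precede the first date")
    # Each flow is divided by (1 + rate) raised to the elapsed fraction of a year.
    return sum(
        v / (1.0 + rate) ** ((d - start).days / 365.0)
        for v, d in zip(values, dates)
    )


# Made-up example: pay 100 today and receive 104 exactly 365 days later.
# At a 4% discount rate the two flows cancel out.
flows = [-100.0, 104.0]
when = [date(2001, 3, 15), date(2002, 3, 15)]
result = xnpv(0.04, flows, when)
```

a positive result means the cash flows beat the discount rate; a result near zero means the deal earned almost exactly that rate.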
figure 13.13 you can determine a modified rate of return, figuring in a financing rate and the interest rate for reinvested profits.

using xirr to calculate a return rate when cash flow dates are not periodic

as in the xnpv example, you can calculate an internal rate of return for a business deal where the dates do not necessarily fall on the last day of the year. to do so, use xirr, as shown in the example at the bottom of figure 13.14.

syntax: xirr(values,dates,guess)

the xirr function returns the internal rate of return for a schedule of cash flows that is not necessarily periodic. to calculate the internal rate of return for a series of periodic cash flows, use the irr function.

this function takes the following arguments:
• values: this is a series of cash flows that corresponds to a schedule of payments in dates. the first payment is optional and corresponds to a cost or payment that occurs at the beginning of the investment. if the first value is a cost or payment, it must be a negative value. all succeeding payments are discounted based on a 365-day year. the series of values must contain at least one positive and one negative value.
• dates: this is a schedule of payment dates that corresponds to the cash flow payments. the first payment date indicates the beginning of the schedule of payments. all other dates must be later than this date, but they may occur in any order.
• guess: this is a number that you guess is close to the result of xirr.

figure 13.14 xnpv takes into account a series of cash flows on a series of dates. the dates do not have to have identical periods, as in npv.

numbers in dates are truncated to integers. xirr expects at least one positive cash flow and one negative cash flow; otherwise, xirr returns a #num! error. if any number in dates is not a valid date, xirr returns a #num! error.
if any number in dates precedes the starting date, xirr returns a #num! error. if values and dates contain different numbers of values, xirr returns a #num! error.

in most cases, you do not need to provide guess for the xirr calculation. if it is omitted, guess is assumed to be 0.1, which is 10%.

xirr is closely related to xnpv, the net present value function. the rate of return calculated by xirr is the interest rate at which xnpv equals 0.

examples of functions for bond investors

a bond is an i.o.u. in which you lend an amount to the issuer. the issuer pays you periodic interest payments and at the maturity date of the bond returns your money. various governments issue many bonds. bond maturities can extend anywhere from 1 day to 30 years.

many concepts and terms apply to the bond functions. for the following discussion, let’s assume that a city issues a 30-year municipal bond. the bond is issued on july 1, 2010. the bond’s maturity date is june 30, 2040. the city agrees to pay 5% interest semiannually.

here is what makes bonds interesting: they can be bought and sold after the issue date. suppose that 14 months have gone past. interest rates have now risen. the bond is going to keep paying 5% interest until it matures. if interest rates have moved above 5%, a potential buyer of the bond will not want to pay the full 1,000 face value for the bond. instead, the buyer might pay 950. thus, there is a price paid for the bond, and there is a value of the bond at maturity.

many bond functions ask for these arguments:
• settlement: this is the day that the buyer purchases the bond. it might be the issue date but is usually after the issue date. in the preceding example, the settlement date is september 1, 2011.
• maturity: this is the day that the issuer will pay the face value of the bond. in the preceding example, the maturity date is june 30, 2040.
• rate: this is the published coupon rate for the bond. in the preceding example, it is 5%.
• pr: this is the price that the current buyer paid for the bond. if the bond was purchased on the issue date, the price matches the face value of the bond. if it was purchased on a later date, the price is higher or lower than the face value, depending on whether interest rates go up or down. for example, if interest rates go up, bond prices go down. if interest rates go down, bond prices go up. the pr is expressed as the price per 100 of face value. if you buy a 1,000 face-value bond for 950, the price is 95 per 100, so you enter 95 for the price argument.
• redemption: this is the value of the bond on the maturity date. it is the amount the issuer will pay back to the holder of the bond. the redemption value is expressed per 100 of face value. if you buy a 1,000 face-value bond that will pay 1,000 at maturity, you enter 100 for the redemption argument.
• frequency: this is the number of interest payments per year. for semiannual interest payments, enter 2. for quarterly payments, enter 4. for annual payments, enter 1. in excel, these are the only three frequency values allowed.
• basis: this is a code that identifies the day count convention, that is, how the days in a month and a year are counted. the values are the same as those available in the yearfrac function. for most u.s. bonds, basis is 0 to indicate a 30/360 nasd calendar. for european bonds, consult excel help for the yield function.

for more information on basis, see the yearfrac discussion in chapter 11, “using everyday functions: math, date and time, and text functions.”

caution: the algorithm behind the xirr function is new and more accurate in excel 2010. be aware that excel 2007 and excel 2010 might produce different results for some uses of xirr.

using yield to calculate a bond’s yield

a 1,000 bond might promise to pay 5% interest. however, if you buy the bond on the secondary market for 95, the actual yield will not be 5%.
as you are trying to compare various investments, comparing the yield is one way to decide between multiple investment opportunities. to do this, you can use excel’s yield function.

syntax: yield(settlement,maturity,rate,pr,redemption,frequency,basis)

the yield function returns the yield on a security that pays periodic interest. you use yield to calculate bond yield.

this function takes the following arguments:
• settlement: this is the security’s settlement date. the security settlement date is the date after the issue date when the security is traded to the buyer. dates may be entered as the following:
  • text strings within quotation marks, such as “1/30/1998” and “1998/01/30”
  • serial numbers such as 40252, which represents march 15, 2010
  • results of other formulas or functions, such as date(2010,3,15)
• maturity: this is the security’s maturity date. the maturity date is the date when the security expires.
• rate: this is the security’s annual coupon rate.
• pr: this is the security’s price per 100 face value.
• redemption: this is the security’s redemption value per 100 face value.
• frequency: this is the number of coupon payments per year. for annual payments, frequency is 1; for semiannual, frequency is 2; and for quarterly, frequency is 4.
• basis: this is the type of day count basis to use. it defaults to 0, which is appropriate for u.s. bonds.

the settlement, maturity, frequency, and basis arguments are truncated to integers. if settlement or maturity is not a valid date, yield returns a #num! error. if rate is less than 0, yield returns a #num! error. if pr is less than or equal to 0 or if redemption is less than or equal to 0, yield returns a #num! error. if frequency is any number other than 1, 2, or 4, yield returns a #num! error. if basis is less than 0 or if basis is greater than 4, yield returns a #num! error. if settlement is greater than or equal to maturity, yield returns a #num! error.
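the link between a bond’s price and its yield can be sketched in python. this is a simplified illustration and not excel’s algorithm: it assumes the settlement date falls exactly on a coupon date, so a whole number of coupon periods remains and no day count basis is needed. the function names, the bisection search, and its bounds are assumptions made for the example.

```python
def bond_price(rate, yld, redemption, frequency, periods):
    """Price per 100 face value with `periods` whole coupon periods remaining.

    Simplifying assumption: settlement is on a coupon date, so there is
    no partial period and no accrued interest.
    """
    coupon = 100.0 * rate / frequency      # coupon paid each period
    y = yld / frequency                    # yield per period
    pv_coupons = sum(coupon / (1.0 + y) ** k for k in range(1, periods + 1))
    return pv_coupons + redemption / (1.0 + y) ** periods


def bond_yield(rate, pr, redemption, frequency, periods,
               lo=0.0, hi=1.0, tol=1e-10):
    """Find the annual yield at which bond_price equals pr, by bisection.

    Price falls as yield rises, so the search keeps the yield bracketed
    between lo (price too high) and hi (price too low).
    """
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if bond_price(rate, mid, redemption, frequency, periods) > pr:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


# A 5% semiannual bond with 10 years (20 periods) left, bought at 95 per 100.
par_price = bond_price(0.05, 0.05, 100.0, 2, 20)
discount_yield = bond_yield(0.05, 95.0, 100.0, 2, 20)
```

when the yield equals the coupon rate, the price comes out at 100; a price below 100 implies a yield above the coupon rate, which is the situation described at the start of this section.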
figure 13.15 shows an example of yield.

figure 13.15 use yield to calculate the yield rate for a bond. in these examples, the price in row 4 changes.

using price to back into a bond price

if you know the yield for a bond, you can use price to calculate the price per 100 of face value.

syntax: price(settlement,maturity,rate,yld,redemption,frequency,basis)

the price function returns the price per 100 face value of a security that pays periodic interest.

this function takes the following arguments:
• settlement: this is the security’s settlement date, which is the date on which you purchased the bond.
• maturity: this is the security’s maturity date, which is the date when the security expires.
• rate: this is the security’s annual coupon rate.
• yld: this is the security’s annual yield.
• redemption: this is the security’s redemption value per 100 face value.
• frequency: this is the number of coupon payments per year. for example, use 2 for semiannual.
• basis: this is the type of day count basis to use. for example, use 0 for u.s. bonds.

the settlement, maturity, frequency, and basis arguments are truncated to integers. if settlement or maturity is not a valid date, price returns a #num! error. if yld is less than 0 or if rate is less than 0, price returns a #num! error. if redemption is less than or equal to 0, price returns a #num! error. if frequency is any number other than 1, 2, or 4, price returns a #num! error. if basis is less than 0 or if basis is greater than 4, price returns a #num! error. if settlement is greater than or equal to maturity, price returns a #num! error.

in figure 13.16, the yield for the bond exceeds the coupon rate. this indicates that the price will be less than 100.

figure 13.16 if you know the yield, you can back into the price by using the price function.

when a bond is sold on the secondary market, it is often sold in between interest payments.
each interest payment date is called a coupon date. a whole series of coup functions analyze the coupon period. the functions can tell you the previous coupon date, the next coupon date, how many days since the previous coupon date, and how many days until the next coupon date:
• coupdays: this returns the number of days in this coupon period.
• coupdaybs: this returns the number of days from the beginning of the coupon period until the settlement date. the bs in the function name stands for from beginning to settlement.
• coupdaysnc: this returns the number of days from the settlement until the next coupon date. nc stands for next coupon.
• couppcd: this returns the date of the previous coupon date.
• coupncd: this returns the date of the next coupon date.
• coupnum: this returns the number of coupon dates left until maturity.

all the coup functions require the same four arguments: settlement, maturity, frequency, and basis. for an explanation of these arguments, see the sections on yield and price. figure 13.17 shows these coupon functions for a particular security.

figure 13.17 you can analyze what portion of a coupon period has gone past at the settlement date for the bond.

using received to calculate total cash generated from a bond investment

when you buy a bond, your settlement date is probably between two coupon dates. unless you are buying the bond on the issue date, you receive less than the complete number of interest payments. to calculate the total future cash flows from a bond from the day you buy it until the maturity date, you use the received function.

syntax: received(settlement,maturity,investment,discount,basis)

the received function returns the amount received at maturity for a fully invested security.
this function takes the following arguments:
• settlement: this is the security’s settlement date, which is the date on which you purchased the security.
• maturity: this is the security’s maturity date, which is the date when the security expires.
• investment: this is the amount invested in the security.
• discount: this is the security’s discount rate.
• basis: this is the type of day count basis to use. for example, use 0 for u.s. bonds.

the settlement, maturity, and basis arguments are truncated to integers. if settlement or maturity is not a valid date, received returns a #num! error. if investment is less than or equal to 0 or if discount is less than or equal to 0, received returns a #num! error. if basis is less than 0 or if basis is greater than 4, received returns a #num! error. if settlement is greater than or equal to maturity, received returns a #num! error.

in figure 13.18, columns b, c, and d show the total received for a bond purchased on various dates. the function takes into account the days to the next coupon date.

figure 13.18 in the 15 days between cells b1 and c1, you lose 2.37 in interest.

using intrate to back into the coupon interest rate

if you have a fully invested bond and know what it will pay on maturity, you can use excel’s intrate function to back into the interest rate.

syntax: intrate(settlement,maturity,investment,redemption,basis)

the intrate function returns the interest rate for a fully invested security.

this function takes the following arguments:
• settlement: this is the security’s settlement date.
• maturity: this is the security’s maturity date.
• investment: this is the amount invested in the security.
• redemption: this is the amount to be received at maturity.
• basis: this is the type of day count basis to use. you use 0 for u.s. bonds.

the settlement, maturity, and basis arguments are truncated to integers.
if settlement or maturity is not a valid date, intrate returns a #num! error. if investment is less than or equal to 0 or if redemption is less than or equal to 0, intrate returns a #num! error. if basis is less than 0 or if basis is greater than 4, intrate returns a #num! error. if settlement is greater than or equal to maturity, intrate returns a #num! error.

intrate calculates (redemption value - investment) / investment and multiplies this by (number of days in year / days from settlement to maturity).

in figure 13.19, excel uses intrate to back into the interest rate that the bond is paying.

figure 13.19 the intrate function can be used to derive the underlying interest rate for the bond.

using disc to back into the discount rate

if you have a security and know the price, you can back into the discount rate by using disc.

syntax: disc(settlement,maturity,pr,redemption,basis)

the disc function returns the discount rate for a security. it takes the following arguments:
• settlement: this is the security’s settlement date.
• maturity: this is the security’s maturity date. the maturity date is the date when the security expires.
• pr: this is the security’s price per 100 face value.
• redemption: this is the security’s redemption value per 100 face value.
• basis: this is the day count basis. you use 0 for u.s. bonds.

the settlement, maturity, and basis arguments are truncated to integers. if settlement or maturity is not a valid date, disc returns a #num! error. if pr is less than or equal to 0, or if redemption is less than or equal to 0, disc returns a #num! error. if basis is less than 0 or if basis is greater than 4, disc returns a #num! error. if settlement is greater than or equal to maturity, disc returns a #num! error.

disc calculates (redemption value - price) / redemption value and multiplies this by (number of days in year / days from settlement to maturity). note that the discount is measured against the redemption value, whereas intrate measures the gain against the amount invested.
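the arithmetic behind intrate and disc can be sketched in python. this is a minimal illustration under stated assumptions: the day counts are passed in directly rather than derived from dates and a basis code, the function names are hypothetical, and the disc sketch divides by the redemption value, following the formula in excel’s documentation.

```python
def intrate(investment, redemption, days_held, days_in_year=360):
    """Annualized gain measured against the amount invested."""
    return (redemption - investment) / investment * days_in_year / days_held


def disc(pr, redemption, days_held, days_in_year=360):
    """Annualized discount measured against the redemption value."""
    return (redemption - pr) / redemption * days_in_year / days_held


# Made-up example: pay 95 for a security that redeems at 100 in 180 days.
interest_rate = intrate(95.0, 100.0, 180)   # gain of 5 on 95 invested
discount_rate = disc(95.0, 100.0, 180)      # discount of 5 on 100 redeemed
```

for the same security, the interest rate is always a little higher than the discount rate, because the same gain is divided by the smaller purchase amount.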
in figure 13.20, excel uses disc to back into the discount rate.

figure 13.20 the disc function can be used to derive the underlying discount rate for a bond.

handling bonds with an odd number of days in the first or last period

excel provides four functions (oddfprice, oddfyield, oddlprice, and oddlyield) to handle the special case in which a bond has a short or long first or last period. this period has more or fewer days than all the other periods and is called an odd period. for an explanation of the arguments to these functions, see the information on the price and yield functions earlier in this chapter.

syntax: oddfprice(settlement,maturity,issue,first_coupon,rate,yld,redemption,frequency,basis) and oddfyield(settlement,maturity,issue,first_coupon,rate,pr,redemption,frequency,basis)

the oddf functions, oddfprice and oddfyield, handle cases in which the first period has an odd number of days. each function has the extra arguments issue and first_coupon, which specify the security’s issue date and first coupon date; together they define the odd first period.

syntax: oddlprice(settlement,maturity,last_interest,rate,yld,redemption,frequency,basis) and oddlyield(settlement,maturity,last_interest,rate,pr,redemption,frequency,basis)

the oddl functions, oddlprice and oddlyield, handle cases in which the last period has an odd number of days. these functions have the extra argument last_interest, which is the date of the final interest payment before maturity. using this date, excel can determine the length of the last period.

using pricemat and yieldmat to calculate price and yield for zero-coupon bonds

a zero-coupon bond does not pay interest on the coupon dates. all interest is paid at maturity. excel provides pricemat and yieldmat to calculate price and yield for these securities. figure 13.21 illustrates both of these functions.
syntax: pricemat(settlement_date,maturity_date,issue_date,rate_at_date_of_issue,annual_yield,day_basis)

the pricemat function returns the price per 100 face value of a security that pays interest at maturity.

syntax: yieldmat(settlement_date,maturity_date,issue_date,rate_at_date_of_issue,price_per_100_of_face_value,day_basis)

the yieldmat function returns the annual yield of a security that pays interest at maturity.

figure 13.21 yieldmat and pricemat calculate bonds for which the interest is not paid until maturity.

using pricedisc and yielddisc to calculate discount bonds

excel provides pricedisc and yielddisc for calculating discounted bonds. figure 13.22 illustrates these functions.

syntax: pricedisc(settlement_date,maturity_date,discount_rate,redemption_value_per_100,day_basis)

the pricedisc function returns the price per 100 face value of a discounted security.

syntax: yielddisc(settlement_date,maturity_date,price_per_100_face_value,redemption_value_per_100_of_face_value,day_basis)

the yielddisc function returns the annual yield for a discounted security.

calculating t-bills

treasury bills, which are also referred to as t-bills, are a popular short-term investment. backed by the u.s. government, t-bills are considered one of the safest investments, although they offer a slightly lower interest rate than other types of investments.

the federal reserve uses a strange method for advertising the yield on t-bills: the fed compares the total interest to the final value paid on maturity. this is backward from every other bond yield. for example, suppose that you pay 98.70 for a t-bill that will pay 100 on maturity 13 weeks later. the fed expresses the yield by comparing the 1.30 in interest to the 100 final value. every other bond yield compares the 1.30 in interest to the 98.70 invested. the excel tbill functions allow you to compare t-bills and regular bonds.
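the two ways of quoting the same t-bill can be compared with a short python sketch. this is an illustration under stated assumptions: 13 weeks is taken as 91 days, both rates are annualized on a 360-day year (the convention that excel’s documentation gives for tbillprice and tbillyield), and the function names are hypothetical.

```python
def tbill_discount_rate(price, days, face=100.0):
    """Rate as advertised for T-bills: interest compared to the face value."""
    return (face - price) / face * 360.0 / days


def tbill_investment_yield(price, days, face=100.0):
    """Rate as quoted for other bonds: interest compared to the price paid."""
    return (face - price) / price * 360.0 / days


# The example from the text: pay 98.70 now, receive 100 in 13 weeks (91 days).
quoted = tbill_discount_rate(98.70, 91)      # 1.30 compared to 100
earned = tbill_investment_yield(98.70, 91)   # 1.30 compared to 98.70
```

here the advertised discount rate works out to about 5.14%, while the yield on the money actually invested is about 5.21%.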
figure 13.23 illustrates the three t-bill functions.

figure 13.22 yielddisc and pricedisc calculate discounted bonds.

syntax: tbilleq(settlement_date,maturity_date,discount_rate)

the tbilleq function returns the bond-equivalent yield for a t-bill.

syntax: tbillprice(settlement_date,maturity_date,discount_rate)

the tbillprice function returns the price per 100 face value for a t-bill.

syntax: tbillyield(settlement_date,maturity_date,price_per_100_face_value)

the tbillyield function returns the yield for a t-bill.

figure 13.23 tbilleq and the other tbill functions deal with the irregularities of t-bill investing.

using accrint or accrintm to calculate accrued interest

if you are the original buyer of a bond and you buy that bond after the issue date, the bond will have earned some accrued interest during that gap. as the original buyer of the bond, you generally pay this interest back to the issuer when you take possession of the bond. this simplifies accounting for the issuer, which can issue identical payments at the next coupon date without having to worry about dozens of different settlement dates. the accrint function calculates this accrued interest.

syntax: accrint(issue,first_interest,settlement,rate,par,frequency,basis)

the accrint function returns the accrued interest for a security that pays periodic interest.

this function takes the following arguments:
• issue: this is the security’s issue date.
• first_interest: this is the security’s first interest date.
• settlement: this is the security’s settlement date. the security settlement date is the date after the issue date when the security is traded to the buyer. the accrint function calculates the interest that would have been earned between the issue date and the settlement date.
• rate: this is the security’s annual coupon rate.
• par: this is the security’s par value.
if you omit par, accrint uses 1,000.
• frequency: this is the number of coupon payments per year.
• basis: this is the type of day count basis to use. you use 0 for u.s. bonds.

if issue is greater than or equal to settlement, accrint returns a #num! error.

figure 13.24 demonstrates how the accrued interest changes when the gap between the issue date and settlement date extends.

figure 13.24 as the original buyer of the bond, you owe the accrued interest to the issuer.

note: the accrintm function calculates accrued interest for zero-coupon bonds, as shown in row 18 in figure 13.24.

using duration to understand price volatility

duration is a measurement, in years, of how long it takes for the price of a bond to be repaid by its cash flows. this measurement is not relevant for zero-coupon bonds because with a zero-coupon bond, the duration simply equals the time to maturity.

suppose that you have a 20-year bond with a 9% yield that pays interest twice a year. it might take about 6 years of interest payments before you earn back the original purchase price of the bond.

duration is constantly changing. immediately after a coupon date, the duration goes up slightly because the interest payment is no longer counted as a future cash flow. however, over the life of the bond, the duration gets progressively shorter, until the duration date corresponds with the maturity date.

duration is important because the higher the duration, the higher the price volatility for the security.

when excel calculates a duration, using the duration function, it uses the method designed by frederick macaulay in the 1930s. this method multiplies the present value of each cash flow by the time it is received. those values are summed and divided by the total price for the security.

excel also has a modified duration function, mduration. this function divides the macaulay duration by (1 + yield/frequency). the result approximates the percentage change in the bond’s price for a 1 percentage point change in yield. in figure 13.25, the duration for the 5% yield is 6.879 years. the mduration return is 6.712, which is consistent with dividing 6.879 by 1.025, that is, a 5% yield with semiannual payments. in other words, if the yield changed from 5 to 6%, the price would fall by roughly 6.7%. the higher the modified duration, the more volatile the bond’s price.

syntax: duration(settlement_date,maturity_date,coupon_rate,yield_rate,frequency,basis)

the duration function returns the macaulay duration for an assumed par value of 100. duration is defined as the weighted average of the times at which the cash flows are received, weighted by their present values, and is used as a measure of a bond price’s response to changes in yield.

syntax: mduration(settlement_date,maturity_date,coupon_rate,yield_rate,frequency,basis)

the mduration function returns the modified duration for a security with an assumed par value of 100.

figure 13.25 duration indicates how many years it will take to earn back the security’s purchase price. mduration adjusts the duration for the yield per coupon period to estimate how sensitive the price is to a change in yield.

examples of miscellaneous financial functions

excel offers a few other financial functions that may be useful if you are dealing with ancient historical data.

on april 9, 2001, all u.s. stock markets were forced to start trading securities in dollars and cents instead of dollars and fractions. the united states was the last nation using the fractional system, which was an eighteenth-century system.

in the fractional system, a stock price may have been reported in the newspaper as 5 5/8, which is roughly equivalent to 5.63. however, a common system in brokerage houses was to record this as 5.5, with the .5 indicating 5/8. in an alternative system, prices were recorded in 16ths, with, for example, 1.03 meaning 1 3/16.

using dollarde to convert to decimals

if you encounter an old worksheet that uses fractional prices, you can convert them to decimals by using dollarde. you must specify the price in the nomenclature of the system and specify whether the number after the decimal point is in 8ths, 16ths, or 32nds. for example, =dollarde(5.5,8) returns 5.625, and =dollarde(1.03,16) returns 1.1875.

syntax: dollarde(fractional_dollar,fraction)

the dollarde function converts a dollar price expressed as a fraction into a dollar price expressed as a decimal number. you use dollarde to convert fractional dollar numbers, such as securities prices, to decimal numbers.
• fractional_dollar: this is a number expressed as a fraction.
• fraction: this is the integer to use in the denominator of the fraction.

syntax: dollarfr(decimal_dollar,fraction)

the dollarfr function converts a dollar price expressed as a decimal number into a dollar price expressed as a fraction.

using fvschedule to calculate the future value for a variable scheduled interest rate

the fv function discussed at the beginning of this chapter assumes a constant interest rate. if you have a loan agreement that specifies a variable interest rate for future years, you can calculate the future value based on the scheduled interest rate. to do so, you use the fvschedule function.

syntax: fvschedule(principal,schedule)

the fvschedule function returns the future value of an initial principal after applying a series of compound interest rates. use fvschedule to calculate the future value of an investment with a variable or adjustable rate.

this function takes the following arguments:
• principal: this is the present value.
• schedule: this is an array of interest rates to apply.

the values in schedule can be numbers or blank cells; any other value produces a #value! error for fvschedule. blank cells are assumed to be zeros, which means no interest.
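the compounding that fvschedule performs can be sketched in python. this is a minimal illustration, not excel’s code: the function name mirrors the worksheet function for readability, a blank cell is represented by none (an assumption of this sketch), and the sample rates are made up.

```python
def fvschedule(principal, schedule):
    """Future value after compounding principal by each rate in turn.

    A rate of None stands in for a blank cell and is treated as 0,
    which means no interest for that period.
    """
    value = principal
    for rate in schedule:
        value *= 1.0 + (rate or 0.0)
    return value


# Made-up schedule: 5% in year 1, 6% in year 2, and a blank (0%) year 3.
future_value = fvschedule(1000.0, [0.05, 0.06, None])
```

each period’s rate is applied to the balance left by the previous period, so the order of the rates does not change the result, but every rate in the schedule does.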
figure 13.26 shows three examples of variable interest rates.

figure 13.26 calculating a future value for a series of scheduled future interest rates by using fvschedule.

14 using statistical functions

statistics in excel fall into three broad categories:
• descriptive statistics that describe a data set: these include measures of central tendency and dispersion.
• regression tools: these allow you to predict future values based on past values.
• inferential statistics: this type of statistic allows you to predict the likelihood of an event happening, based on a sample of a population.

table 14.1 provides an alphabetical list of all the excel 2010 statistical functions. detailed examples of the functions are provided in the remainder of the chapter.

table 14.1 alphabetical list of statistical functions

function: description

avedev(number1,number2,...): returns the average of the absolute deviations of data points from their mean. avedev is a measure of the variability in a data set.

average(number1,number2,...): returns the average (arithmetic mean) of the arguments.

averagea(value1,value2,...): calculates the average (arithmetic mean) of the values in the list of arguments. in addition to numbers, text and logical values, such as true and false, are included in the calculation.

beta.dist(x,alpha,beta,cumulative,a,b): returns the beta distribution; when cumulative is true, this is the cumulative beta probability density function. the cumulative beta probability density function is commonly used to study variation in the percentage of something across samples, such as the fraction of the day people spend watching television.

beta.inv(probability,alpha,beta,a,b): returns the inverse of the cumulative beta probability density function. that is, if probability is equal to beta.dist(x,...), then beta.inv(probability,...) is equal to x.
the cumulative beta distribution can be used in project planning to model probable completion times, given an expected comple-tion time and variability. binom.dist (number_s,trials, probability_s,cumulative) returns the individual term binomial dis-tribution probability. you use binom.dist in problems with a fixed number of tests or trials, when the outcomes of any trial are only success or failure, when trials are inde-pendent, and when the probability of success is constant throughout the experiment. for example, binom.dist can calculate the prob-ability that two of the next three babies born will be male. binom.inv (trials,probability_s, alpha) returns the smallest value for which the cumulative binomial distribution is greater than or equal to a criterion value. you use this function for quality assurance applica-tions. for example, you can use binom.inv to determine the greatest number of defec-tive parts that are allowed to come off an assembly line run without having to reject the entire lot. chisq.dist ( x ,degrees_freedom) returns the one-tailed probability of the chi-squared distribution. the chi-squared distribution is associated with a chi-squared test. you use the chi-squared test to compare observed and expected values. for example, in a genetic experiment, you might hypoth-esize that the next generation of plants will exhibit a certain set of colors. by comparing the observed results with the expected ones, you can decide whether your original hypoth-esis is valid.391 14 chapter function description chisq.dist.rt (x,degrees_freedom) returns the right-tailed probability of the chi-squared distribution. chisq.inv (probability,degrees_ freedom) returns the inverse of the one-tailed proba-bility of the chi-squared distribution. if proba-bility is equal to chisq.dist(x,...), then chisq.inv(probability,...) is x. you use this function to compare observed results with expected ones to decide whether your original hypothesis is valid. 
CHISQ.INV.RT(probability,degrees_freedom): Returns the inverse of the right-tailed probability of the chi-squared distribution.
CHISQ.TEST(actual_range,expected_range): Returns the test for independence. CHISQ.TEST returns the value from the chi-squared distribution for the statistic and the appropriate degrees of freedom. You can use chi-squared tests to determine whether hypothesized results are verified by an experiment.
CONFIDENCE.NORM(alpha,standard_dev,size): Returns the confidence interval for a population mean. The confidence interval is a range on either side of a sample mean. For example, if you order a product through the mail, you can determine, with a particular level of confidence, the earliest and latest the product will arrive. Uses the standard normal distribution. Renamed from CONFIDENCE in Excel 2010.
CONFIDENCE.T(alpha,standard_dev,size): Returns the confidence interval based on the Student's t-distribution.
CORREL(array1,array2): Returns the correlation coefficient of the array1 and array2 cell ranges. You use the correlation coefficient to determine the relationship between two properties. For example, you can examine the relationship between a location's average temperature and the use of air conditioners.
COVARIANCE.P(array1,array2): Returns covariance, the average of the products of deviations for each data point pair. You use covariance to determine the relationship between two data sets. For example, you can examine whether greater income accompanies greater levels of education. Based on a population.
COVARIANCE.S(array1,array2): Returns covariance, the average of the products of deviations for each data point pair. You use covariance to determine the relationship between two data sets. For example, you can examine whether greater income accompanies greater levels of education. Based on a sample.
DEVSQ(number1,number2,...): Returns the sum of squares of deviations of data points from their sample mean.
EXPON.DIST(x,lambda,cumulative): Returns the exponential distribution. You use EXPON.DIST to model the time between events, such as how long a bank's automated teller machine takes to deliver cash. For example, you can use EXPON.DIST to determine the probability that the process takes, at most, 1 minute.
F.DIST(x,degrees_freedom1,degrees_freedom2,cumulative): Returns the F probability distribution. You can use this function to determine whether two data sets have different degrees of diversity. For example, you can examine test scores given to men and women entering high school and determine whether the variability in the females is different from that found in the males.
F.DIST.RT(x,degrees_freedom1,degrees_freedom2): Returns the right-tailed F probability distribution.
F.INV(probability,degrees_freedom1,degrees_freedom2): Returns the inverse of the F probability distribution. If probability is equal to F.DIST(x,...,TRUE), then F.INV(probability,...) is equal to x.
F.INV.RT(probability,degrees_freedom1,degrees_freedom2): Returns the inverse of the right-tailed F probability distribution.
F.TEST(array1,array2): Returns the result of an F-test. An F-test returns the two-tailed probability that the variances in array1 and array2 are not significantly different. You use this function to determine whether two samples have different variances. For example, given test scores from public and private schools, you can test whether those schools have different levels of diversity.
FISHER(x): Returns the Fisher transformation at x. This transformation produces a function that is approximately normally distributed rather than skewed. You use this function to perform hypothesis testing on the correlation coefficient.
FISHERINV(y): Returns the inverse of the Fisher transformation. You use this transformation when analyzing correlations between ranges or arrays of data. If y is equal to FISHER(x), then FISHERINV(y) is equal to x.
FORECAST(x,known_y's,known_x's): Calculates, or predicts, a future value by using existing values. The predicted value is a y value for a given x value. The known values are existing x values and y values, and the new value is predicted by using linear regression. You can use this function to predict future sales, inventory requirements, or consumer trends.
FREQUENCY(data_array,bins_array): Calculates how often values occur within a range of values and returns a vertical array of numbers. For example, you can use FREQUENCY to count the number of test scores that fall within ranges of scores. Because FREQUENCY returns an array, it must be entered as an array formula.
GAMMA.DIST(x,alpha,beta,cumulative): Returns the gamma distribution. You can use this function to study variables that may have a skewed distribution. The gamma distribution is commonly used in queuing analysis.
GAMMA.INV(probability,alpha,beta): Returns the inverse of the gamma cumulative distribution. If probability is equal to GAMMA.DIST(x,...,TRUE), then GAMMA.INV(probability,...) is equal to x.
GAMMALN(x): Returns the natural logarithm of the gamma function.
GEOMEAN(number1,number2,...): Returns the geometric mean of an array or a range of positive data. For example, you can use GEOMEAN to calculate average growth rate given compound interest with variable rates.
GROWTH(known_y's,known_x's,new_x's,const): Calculates predicted exponential growth by using existing data. GROWTH returns the y values for a series of new x values that you specify by using existing x values and y values. You can also use the GROWTH worksheet function to fit an exponential curve to existing x values and y values.
HARMEAN(number1,number2,...): Returns the harmonic mean of a data set. The harmonic mean is the reciprocal of the arithmetic mean of reciprocals.
HYPGEOM.DIST(sample_s,number_sample,population_s,number_population,cumulative): Returns the hypergeometric distribution. HYPGEOM.DIST returns the probability of a given number of sample successes, given the sample size, population successes, and population size. You use HYPGEOM.DIST for problems with a finite population, where each observation is either a success or a failure, and where each subset of a given size is chosen with equal likelihood.
INTERCEPT(known_y's,known_x's): Calculates the point at which a line will intersect the y-axis by using existing x values and y values. The intercept point is based on a best-fit regression line plotted through the known x values and known y values. You use the intercept when you want to determine the value of the dependent variable when the independent variable is 0. For example, you can use the INTERCEPT function to predict a metal's electrical resistance at 0 degrees Celsius when your data points were taken at room temperature and higher.
KURT(number1,number2,...): Returns the kurtosis of a data set. Kurtosis characterizes the relative peakedness or flatness of a distribution compared with the normal distribution. Positive kurtosis indicates a relatively peaked distribution. Negative kurtosis indicates a relatively flat distribution.
LARGE(array,k): Returns the kth largest value in a data set. You can use this function to select a value based on its relative standing. For example, you can use LARGE to return a highest, runner-up, or third-place score.
LINEST(known_y's,known_x's,const,stats): Calculates the statistics for a line by using the least-squares method to calculate a straight line that best fits the data and returns an array that describes the line. Because this function returns an array of values, it must be entered as an array formula.
LOGEST(known_y's,known_x's,const,stats): In regression analysis, calculates an exponential curve that fits the data and returns an array of values that describes the curve. Because this function returns an array of values, it must be entered as an array formula.
LOGNORM.DIST(x,mean,standard_dev,cumulative): Returns the lognormal distribution of x, where ln(x) is normally distributed with the parameters mean and standard_dev. You use this function to analyze data that has been logarithmically transformed.
LOGNORM.INV(probability,mean,standard_dev): Returns the inverse of the lognormal cumulative distribution function of x, where ln(x) is normally distributed with the parameters mean and standard_dev. If probability is equal to LOGNORM.DIST(x,...,TRUE), then LOGNORM.INV(probability,...) is equal to x.
MAX(number1,number2,...): Returns the largest value in a set of values.
MAXA(value1,value2,...): Returns the largest value in a list of arguments. Text and logical values such as TRUE and FALSE are compared, as are numbers.
MEDIAN(number1,number2,...): Returns the median of the given numbers. The median is the number in the middle of a set of numbers; that is, half the numbers have values that are greater than the median and half have values that are less.
MIN(number1,number2,...): Returns the smallest number in a set of values.
MINA(value1,value2,...): Returns the smallest value in a list of arguments. Text and logical values such as TRUE and FALSE are compared, as are numbers.
MODE.MULT(number1,number2,...): Returns a vertical array of the most frequently occurring, or repetitive, values in an array or a range of data. MODE.MULT is new in Excel 2010 and handles the specific case when two or more values are tied for the most frequently occurring value. Whereas MODE.SNGL returns only the first mode value, MODE.MULT returns all the mode values.
MODE.SNGL(number1,number2,...): Returns the most frequently occurring, or repetitive, value in an array or a range of data. Like MEDIAN, MODE.SNGL is a location measure. MODE.SNGL is renamed from MODE in Excel 2010. If two values are tied for the most frequently occurring value, only the first one is returned by MODE.SNGL. If you need to return all the tied values, use the new MODE.MULT.
NEGBINOM.DIST(number_f,number_s,probability_s,cumulative): Returns the negative binomial distribution. NEGBINOM.DIST returns the probability that there will be number_f failures before the number_s-th success, when the constant probability of a success is probability_s. This function is similar to the binomial distribution function, except that the number of successes is fixed and the number of trials is variable. As with the binomial distribution function, trials are assumed to be independent.
NORM.DIST(x,mean,standard_dev,cumulative): Returns the normal cumulative distribution for the specified mean and standard deviation. This function has a very wide range of applications in statistics, including hypothesis testing.
NORM.INV(probability,mean,standard_dev): Returns the inverse of the normal cumulative distribution for the specified mean and standard deviation.
NORM.S.DIST(z,cumulative): Returns the standard normal cumulative distribution function. The distribution has a mean of zero and a standard deviation of one. You use this function in place of a table of standard normal curve areas.
NORM.S.INV(probability): Returns the inverse of the standard normal cumulative distribution. The distribution has a mean of zero and a standard deviation of one.
PEARSON(array1,array2): Returns the Pearson product-moment correlation coefficient, r, a dimensionless index that ranges from -1.0 to 1.0, inclusive, and reflects the extent of a linear relationship between two data sets.
PERCENTILE.EXC(array,k): Returns the kth percentile of values in a range. You can use this function to establish a threshold of acceptance. For example, you can decide to examine candidates who score above the 90th percentile. PERCENTILE.EXC is new in Excel 2010 and assumes the percentile is between 0 and 1, exclusive.
PERCENTILE.INC(array,k): Returns the kth percentile of values in a range. PERCENTILE.INC is renamed from PERCENTILE in Excel 2010 and assumes the percentile is between 0 and 1, inclusive.
PERCENTRANK.EXC(array,x,significance): Returns the rank of a value in a data set as a percentage of the data set. This function can be used to evaluate the relative standing of a value within a data set. PERCENTRANK.EXC is new in Excel 2010. It assumes the percentile is between 0 and 1, exclusive.
PERCENTRANK.INC(array,x,significance): Returns the rank of a value in a data set as a percentage of the data set. This function can be used to evaluate the relative standing of a value within a data set. For example, you can use PERCENTRANK.INC to evaluate the standing of an aptitude test score among all scores for the test. PERCENTRANK.INC is renamed from PERCENTRANK in Excel 2010. It assumes percentiles from 0 to 1, inclusive.
PERMUT(number,number_chosen): Returns the number of permutations for a given number of objects that can be selected from number objects. A permutation is any set or subset of objects or events where internal order is significant. Permutations are different from combinations, for which the internal order is not significant. You use this function for lottery-style probability calculations.
POISSON.DIST(x,mean,cumulative): Returns the Poisson distribution. A common application of the Poisson distribution is predicting the number of events over a specific time, such as the number of cars arriving at a toll plaza in one minute.
PROB(x_range,prob_range,lower_limit,upper_limit): Returns the probability that values in a range are between two limits. If upper_limit is not supplied, returns the probability that values in x_range are equal to lower_limit.
QUARTILE.EXC(array,quart): Returns the quartile of a data set. Quartiles are often used in sales and survey data to divide populations into groups. For example, you can use QUARTILE.EXC to find the top 25% of incomes in a population. This function is new in Excel 2010. It assumes percentiles run from 0 to 1, exclusive.
QUARTILE.INC(array,quart): Returns the quartile of a data set. Quartiles are often used in sales and survey data to divide populations into groups. This function is renamed from QUARTILE in Excel 2010 and assumes percentiles run from 0 to 1, inclusive.
RANK.AVG(number,ref,order): Returns the rank of a number in a list of numbers. The rank of a number is its size relative to other values in a list. (If you were to sort the list, the rank of the number would be its position.) When two or more items are tied, RANK.AVG averages their ranks.
RANK.EQ(number,ref,order): Returns the rank of a number in a list of numbers. When two or more items are tied, RANK.EQ assigns the top rank of the tied group to every item in the tie. For example, two items tied for second and third place both receive a rank of 2. Renamed from RANK in Excel 2010.
RSQ(known_y's,known_x's): Returns the square of the Pearson product-moment correlation coefficient through data points in known_y's and known_x's. The r-squared value can be interpreted as the proportion of the variance in y attributable to the variance in x.
SKEW(number1,number2,...): Returns the skewness of a distribution. Skewness characterizes the degree of asymmetry of a distribution around its mean. Positive skewness indicates a distribution with an asymmetric tail extending toward more positive values. Negative skewness indicates a distribution with an asymmetric tail extending toward more negative values.
SLOPE(known_y's,known_x's): Returns the slope of the linear regression line through data points in known_y's and known_x's. The slope is the vertical distance divided by the horizontal distance between any two points on the line, which is the rate of change along the regression line.
SMALL(array,k): Returns the kth smallest value in a data set. You use this function to return values with a particular relative standing in a data set.
STANDARDIZE(x,mean,standard_dev): Returns a normalized value from a distribution characterized by mean and standard_dev.
STDEV.P(number1,number2,...): Calculates standard deviation based on the entire population given as arguments. The standard deviation is a measure of how widely values are dispersed from the average value (that is, the mean).
STDEV.S(number1,number2,...): Estimates standard deviation based on a sample. The standard deviation is a measure of how widely values are dispersed from the average value (that is, the mean).
STDEVA(value1,value2,...): Estimates standard deviation based on a sample. The standard deviation is a measure of how widely values are dispersed from the average value (that is, the mean). Text and logical values such as TRUE and FALSE are included in the calculation.
STDEVPA(value1,value2,...): Calculates standard deviation based on the entire population given as arguments, including text and logical values. The standard deviation is a measure of how widely values are dispersed from the average value (that is, the mean).
STEYX(known_y's,known_x's): Returns the standard error of the predicted y value for each x in the regression. The standard error is a measure of the amount of error in the prediction of y for an individual x.
SUMSQ(number1,number2,...): Returns the sum of the squares of the arguments.
SUMX2MY2(array_x,array_y): Returns the sum of the difference of squares of corresponding values in two arrays.
SUMX2PY2(array_x,array_y): Returns the sum of the sum of squares of corresponding values in two arrays. The sum of the sum of squares is a common term in many statistical calculations.
SUMXMY2(array_x,array_y): Returns the sum of squares of differences of corresponding values in two arrays.
T.DIST(x,degrees_freedom,cumulative): Returns the left-tailed Student's t-distribution, where the numeric value x is a calculated value of t for which percentage points (that is, probabilities) are to be computed. The t-distribution is used in the hypothesis testing of small sample data sets. You use this function in place of a table of critical values for the t-distribution.
T.DIST.2T(x,degrees_freedom): Returns the two-tailed probability for the Student's t-distribution. New in Excel 2010.
T.DIST.RT(x,degrees_freedom): Returns the right-tailed probability for the Student's t-distribution. New in Excel 2010.
T.INV(probability,degrees_freedom): Returns the left-tailed t-value of the Student's t-distribution as a function of the probability and the degrees of freedom.
T.INV.2T(probability,degrees_freedom): Returns the two-tailed t-value of the Student's t-distribution as a function of the probability and the degrees of freedom.
T.TEST(array1,array2,tails,type): Returns the probability associated with a Student's t-test. You use T.TEST to determine whether two samples are likely to have come from the same two underlying populations that have the same mean.
TREND(known_y's,known_x's,new_x's,const): Returns values along a linear trend. Fits a straight line (using the method of least squares) to the arrays known_y's and known_x's. Returns the y values along that line for the array of new_x's that you specify.
TRIMMEAN(array,percent): Returns the mean of the interior of a data set. TRIMMEAN calculates the mean taken by excluding a percentage of data points from the top and bottom tails of a data set. You can use this function when you want to exclude outlying data from your analysis.
VAR.P(number1,number2,...): Calculates variance based on the entire population.
VAR.S(number1,number2,...): Estimates variance based on a sample.
VARA(value1,value2,...): Estimates variance based on a sample. In addition to numbers, text and logical values such as TRUE and FALSE are included in the calculation.
VARPA(value1,value2,...): Calculates variance based on the entire population. In addition to numbers, text and logical values such as TRUE and FALSE are included in the calculation.
WEIBULL.DIST(x,alpha,beta,cumulative): Returns the Weibull distribution. You use this distribution in reliability analysis, such as to calculate a device's mean time to failure.
Z.TEST(array,x,sigma): Returns the one-tailed p value of a z-test. The z-test generates a standard score for x with respect to the data set, array, and returns the one-tailed probability for the normal distribution. You can use this function to assess the likelihood that a particular observation is drawn from a particular population.

Examples of Functions for Descriptive Statistics

Descriptive statistics help describe a population of data. What is the largest? The smallest? The average? Are data points grouped to the left of the average or to the right of the average? How wide is the range of expected values? Do many members of the population have values in the middle, or are they evenly spread throughout the range? All these are measures of descriptive statistics.

Many situations in a business environment involve finding basic information about a data set, such as the largest or smallest values or the rank within a data set.

Using MIN or MAX to Find the Smallest or Largest Numeric Value

If you have a large data set and want to find the smallest or largest value in a column, rather than sort the data set, you can use a function to find the value. To find the smallest numeric value, you use MIN. To find the largest numeric value, you use MAX.
Figure 14.1 shows a list of open receivables, by customer, for 59 customers. Even though the function reference says that you can find the MIN for only 255 numbers, a single rectangular reference counts as just one of the 255 arguments for the function. To find the smallest value in the range, you use =MIN(B2:B60). To find the largest value in the range, you use =MAX(B2:B60).

Figure 14.1 You use MIN and MAX to find the smallest or largest receivables.

Syntax: =MIN(number1,number2,...)

The MIN function returns the smallest number in a set of values. The arguments number1,number2,... are 1 to 255 numbers for which you want to find the minimum value. You can specify arguments that are numbers, empty cells, logical values, or text representations of numbers. Arguments that are error values or text that cannot be translated into numbers cause errors. If an argument is an array or a reference, only numbers in that array or reference are used. Empty cells, logical values, or text in the array or reference are ignored. If logical values and text should not be ignored, you should use MINA instead. If the arguments contain no numbers, MIN returns 0.

Syntax: =MAX(number1,number2,...)

The MAX function returns the largest value in a set of values. The arguments number1,number2,... are 1 to 255 numbers for which you want to find the maximum value. The remaining rules are similar to those for MIN, described in the preceding section.

If you read the descriptions for MINA and MAXA, you might think that the functions can be used to find the smallest or largest text value in a range. Here is the Excel Help description for MAXA:
• MAXA(value1,value2) returns the largest value in a list of arguments. Text and logical values such as TRUE and FALSE are compared as well as numbers.

The problem, however, is that text values are treated as the number 0 in the comparison.
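The difference between MIN and MINA over a range can be sketched in Python. This is a rough model of the rules described above, not Excel's code; the helper names excel_min and excel_mina and the sample values are hypothetical, and empty cells are modeled as None.

```python
def excel_min(cells):
    """Model of MIN over a range: only numbers count; text, logicals, and blanks are ignored."""
    nums = [c for c in cells
            if isinstance(c, (int, float)) and not isinstance(c, bool)]
    return min(nums) if nums else 0  # no numbers at all: MIN returns 0


def excel_mina(cells):
    """Model of MINA over a range: text counts as 0, TRUE as 1, FALSE as 0; blanks are ignored."""
    values = []
    for c in cells:
        if c is None:
            continue                      # empty cell
        if isinstance(c, bool):
            values.append(1 if c else 0)  # logical value
        elif isinstance(c, str):
            values.append(0)              # text is treated as 0
        else:
            values.append(c)
    return min(values) if values else 0


receivables = [1250.00, "n/a", 310.75, None, 980.10]  # hypothetical data
print(excel_min(receivables))   # 310.75
print(excel_mina(receivables))  # 0
```

A result of 0 from the MINA model on otherwise positive data signals that a text entry is present in the range.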
It is a struggle to imagine a scenario where treating text as 0 would be even mildly useful. One possibility: If you have a series of positive numbers and want to know if any of them are text, you can use =MINA(A1:A99). If the result is 0, then you know that there is a text value in the range. Similarly, if you have a range of negative numbers in A1:A99, you could use =MAXA(A1:A99). If any of the values are text, the result will be 0 instead of a negative number.

MINA and MAXA could also be used to evaluate a series of TRUE/FALSE values. FALSE values are treated as 0. TRUE values are treated as 1.

Using LARGE to Find the Top N Values in a List of Values

The MAX function discussed in the preceding section finds the single largest value in a list. Sometimes it is interesting to find the top 10 values in a list. Say that with a list of customer receivables, someone in accounts receivable may want to call the customers with the top 10 receivables in an attempt to collect the accounts. The LARGE function can find the first-largest, second-largest, third-largest, and so on, values in a list.

Syntax: =LARGE(array,k)

The LARGE function returns the kth largest value in a data set. You can use this function to select a value based on its relative standing. For example, you can use LARGE to return a highest, runner-up, or third-place score. This function takes the following arguments:
• array: This is the array or range of data for which you want to determine the kth largest value. If array is empty, LARGE returns a #NUM! error.
• k: This is the position (from the largest) in the array or cell range of data to return. If k is less than or equal to 0 or if k is greater than the number of data points, LARGE returns a #NUM! error.

Follow these steps to build a table of the five largest customer receivables:
1. The second argument of the function will be the numbers 1 through 5. Starting from the data set shown in Figure 14.1, insert a new column A to hold these values.
2. In A66:A70, enter the numbers 1 through 5, as shown in Figure 14.2.
3. In the column letters above the grid, grab the line between columns A and B. Drag to the left to make this column narrower. It should be just wide enough to display the numbers in column A.
4. In column C, row 66, enter =LARGE(. Use the mouse or arrow keys to highlight the range of data. After highlighting the data, press the F4 key to add dollar signs to the reference. This allows you to copy the reference to the next several rows while always pointing at the same range.
5. For the second argument, point to the 1 in cell A66. Leave this reference as relative (that is, no dollar signs) so that it will change to A67, A68, and so on when copied. The first formula in cell C66 indicates that the largest value is 13,560.43. So far, you've done a lot of work just to find out the same thing that the MAX function could have told you. However, the power comes in the next step.
6. Select cell C66. Click the fill handle and drag down to cell C70. You now have a list of the top five open receivables.
7. At this point, you know the amounts of the top receivables, but this immediately brings up the question of which customers have those receivables. Using lookup functions discussed in Chapter 12, "Using Powerful Functions: Logical, Lookup, and Database Functions," you can retrieve the name associated with each receivable amount. Note that this method assumes that no two customers in the top five have exactly the same receivable.
8. Enter the following intermediate formula in cell B66: =MATCH(C66,$C$2:$C$60,0). This formula tells Excel to take the receivable value in cell C66 and to find it in the list of open receivables. The MATCH function returns the relative position within C2:C60 of the matching value. For example, 13,560.43 is found in cell C9. This is the eighth row in the range C2:C60, so MATCH returns the number 8.
9. Knowing that the largest receivable is in the eighth row of a range is not useful to a person trying to collect accounts receivable, so to return the name, ask for the eighth value in the range B2:B60. You can use the INDEX function to do this. =INDEX($B$2:$B$60,8) returns the customer with the largest receivable.
10. Combine the formulas from step 8 and step 9 into a single formula in cell B66: =INDEX($B$2:$B$60,MATCH(C66,$C$2:$C$60,0)).
11. Copy the formula in cell B66 down through cell B70.

As shown in Figure 14.2, the result is a table in A66:C70 that shows the five largest customers. After receiving checks today, you can update the receivable amounts in C2:C60. If Best Raft sent in a check for 10,000, the formulas would automatically move Magnificent Electronics up to the fourth position and move the sixth customer up to the fifth spot.

Rather than adding the numbers 1 through 5 in A66:A70, you could use the ROW() function to return the values of 1 to 5. In cell C66, use =LARGE($C$2:$C$60,ROW(A1)). Because the row number of cell A1 is 1, the ROW function will return a 1 as the second argument to the LARGE function. This method has the advantage that as you drag the formula down, it will switch to ROW(A2) for 2, ROW(A3) for 3, and so on.

Figure 14.2 The LARGE function in column C allows this dynamic table to be built to show the five largest problems.

Using SMALL to Sequence a List in Date Sequence

The MIN function finds the smallest value in a data set. The SMALL function can find the kth smallest value. This can be great for finding not just the smallest value but the second-smallest, third-smallest, and so on. If n is the number of data points in an array, SMALL(array,1) equals the smallest value, and SMALL(array,n) equals the largest value.

Syntax: =SMALL(array,k)

The SMALL function returns the kth smallest value in a data set.
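The top-N table built with LARGE, MATCH, and INDEX, along with the SMALL function just introduced, can be sketched in Python. The helpers large and small and the customer data are hypothetical illustrations, not Excel's implementation. As in the worksheet steps, the name lookup assumes that no two customers share the same amount.

```python
def large(values, k):
    """kth largest value (k starts at 1), in the spirit of LARGE(array,k)."""
    if not values or k <= 0 or k > len(values):
        raise ValueError("#NUM!")  # Excel returns #NUM! in these cases
    return sorted(values, reverse=True)[k - 1]


def small(values, k):
    """kth smallest value (k starts at 1), in the spirit of SMALL(array,k)."""
    if not values or k <= 0 or k > len(values):
        raise ValueError("#NUM!")
    return sorted(values)[k - 1]


# Hypothetical receivables; the names and amounts are made up
names = ["Acme", "Birch", "Cobalt", "Delta", "Ember", "Flint"]
amounts = [4200.00, 13560.43, 980.10, 7125.50, 310.75, 9900.00]

for k in range(1, 4):
    amount = large(amounts, k)
    # amounts.index(amount) plays the role of MATCH(amount,range,0);
    # names[...] plays the role of INDEX
    print(k, names[amounts.index(amount)], amount)

# SMALL(array,1) is the minimum; SMALL(array,n) is the maximum
print(small(amounts, 1) == min(amounts))
print(small(amounts, len(amounts)) == max(amounts))
```

Sorting a copy of the list on every call keeps the sketch short; it is a design shortcut for illustration and says nothing about how Excel computes the result.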
you use this function to return values with a particular relative standing in a data set. array is an array or a range of numeric data for which you want to determine the kth smallest value. if array is empty, small returns a #num! error. k is the position (from the smallest) in the array or range of data to return. if k is less than or equal to 0 or if k exceeds the number of data points, small returns a #num! error.

in figure 14.3, range a2:b19 contains a list of book titles and their publication dates. to find the earliest dates for the books, you use small(). this example contains a twist that makes the formula easier than in the example for large. in the initial formula in cell d2, the argument for k is generated using row(a1). this function returns the number 1. as the formula is copied from cell d2 down to the remaining rows, the reference changes to row(a2) and so on. this allows each row in column d to show a successively larger value from array.

the formula in cell d2 is =small(b2:b19,row(a1)). after you have found the year in column d, the formula in cell e2 to return the title is =index(a2:a19,match(d2,b2:b19,0)).

figure 14.3 the small function in column d finds the earliest years in the list.

using median, mode.sngl, mode.mult, and average to find the central tendency of a data set

you can use three popular measures when trying to find the middle scores in a range:

1. mean: the mean of a data set is the mathematical average. it is calculated by adding all the values in the range and dividing by the number of values in the set. to calculate a mean in excel, use the average function.

2. median: the median of a data set is the value in the middle when the set is arranged from high to low. in the data set, half the values are higher than the median and half the values are lower than the median. to calculate a median in excel, use the median function.

3. mode: the mode of a data set is the value that happens most often. to calculate a mode in excel 2010, use the mode.sngl or mode.mult functions.

syntax: average(number1,number2,...)

the average function returns the average (that is, arithmetic mean) of the arguments. the arguments number1, number2,... are 1 to 255 numeric arguments for which you want the average. the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells containing the value 0 are included.

if you have a range of true/false values and you want to see what percentage of people answered true, you can use averagea() of the range. the averagea function will treat true values as 1 and false values as zero.

syntax: median(number1,number2,...)

the median function returns the median of the given numbers. the median is the number in the middle of a set of numbers; that is, half the numbers have values that are greater than the median and half have values that are less. if there is an even number of numbers in the set, median calculates the average of the two numbers in the middle.

the arguments number1, number2,... are 1 to 255 numbers for which you want the median. the arguments should be either numbers or names, arrays, or references that contain numbers. microsoft excel examines all the numbers in each reference or array argument. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included.

caution: when averaging cells, keep in mind the difference between empty cells and those that contain the value 0. this can be particularly troubling if you have cleared the show a zero in cells that have a zero value check box.
you find this setting by selecting file, options, advanced, display options for this worksheet.

syntax: mode.sngl(number1,number2,...)

the mode.sngl function returns the most frequently occurring, or repetitive, value in an array or a range of data. like median, mode.sngl is a location measure. in a set of values, the mode is the most frequently occurring value; the median is the middle value; and the mean is the average value. no single measure of central tendency provides a complete picture of the data. suppose data is clustered in three areas, half around a single low value, and half around two large values. both average and median may return a value in the relatively empty middle, and mode.sngl may return the dominant low value.

the arguments number1, number2,... are 1 to 255 arguments for which you want to calculate the mode. you can also use a single array or a reference to an array instead of arguments separated by commas. the arguments should be numbers, names, arrays, or references that contain numbers. if an array or reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if the data set contains no duplicate data points, mode.sngl returns a #n/a error.

it is possible to have multiple values that tie as the mode. in the mode.sngl calculation, when two values tie as the mode, only the first mode value appears as the result. in figure 14.4, the mode.sngl result in e4 is reported as 88, even though both 88 and 82 appear most frequently in the data set. microsoft added mode.mult to excel 2010 to handle the situation where multiple values tie as the mode.

figure 14.4 shows examples of average, median, and mode.sngl. cell e2 calculates the arithmetic mean of the test scores in column b: 81.4444. the median in cell e3 is higher: 82.5.
this means that half the students scored above 82.5 and half scored below 82.5. the mode in cell e4 as reported by mode.sngl is 88. a formula in e5 indicates that 82 is also tied as a mode. this is because 82 and 88 each appeared three times in the data set.

the range in figure 14.4 demonstrates two anomalies with the median and mode. in this case, there is an even number of entries: 18. there is no single middle value in this case, so excel takes the average of the two values in the middle (82 and 83) to produce 82.5. this is the only situation in which the median is not a value from the table.

there are also two modes in the table. both 88 and 82 appear three times. mode.sngl reports 88 as the mode because it encounters the 88 in b8 before it encounters the 82 in b9. this is rather arbitrary, and the mode.sngl result would change if the data were sorted in ascending sequence. read on to see how mode.mult can report all the mode values.

note: mode.sngl is a new function name introduced in excel 2010. for backward compatibility, use the mode function.

syntax: mode.mult(number1,number2,...)

the mode.mult function returns a vertical array of the most frequently occurring, or repetitive, values in an array or a range of data. mode.mult has been added to excel 2010 to specifically address the situations where two or more values tie as the mode.

because mode.mult can return multiple values, you might think that you should enter the function in several cells and use ctrl+shift+enter to enter the formula. while this works, the unpredictability of the number of values returned by mode.mult makes this a dicey proposition.

in figure 14.5, you will see four different cases with mode.mult.

• in column b, five values each occur twice, creating a five-way tie for the mode. select cells c3:c7, type =mode.mult(b3:b12), and hold down ctrl+shift while pressing enter.
this enters one formula in those five cells. this works out great; five values are returned and they fill the five cells where the formula is entered.

• the first case has been copied to columns e:f. cell e12 is changed from 5 to 6. this creates a four-way tie for the mode. because the formula is entered in five cells, you get the four-way tie as the first four cells and then #n/a as the fifth cell. this makes sense. there are a number of ways to deal with the #n/a value.

• in column h, all 10 values appear exactly once. excel help warns that if no value appears two or more times, the answer for mode will be #n/a. the results are all #n/a because there is officially no mode.

• in column k, the "normal" case of having one mode causes all sorts of problems. because mode.mult returns a one-cell answer, the array formula assumes that you must want to expand that one-cell answer over the entire range where the formula is entered, so you get five 1s as the answer.

caution: mode.mult is a new function in excel 2010. there is no equivalent function in excel 2007 and earlier. if this workbook is opened in legacy versions of excel, the cell will calculate as a #name? error.

figure 14.4 average, median, and mode.sngl all describe the central tendencies of a data set.

figure 14.5 mode.mult is challenging to use.

in row 14 of figure 14.5, a nonarray formula counts how many results the mode.mult function will return. using =count(mode.mult(b3:b12)) is probably the best way to go.

to return the first mode, you can use =mode.sngl(range).

to see if there is a two-way tie, use =if(count(mode.mult(range))>1,index(mode.mult(range),2),"").

to see if there is a three-way tie, use =if(count(mode.mult(range))>2,index(mode.mult(range),3),"").

you can continue this pattern for as many possible modes as you might expect. in an n-row data set, there might be as many as n/2 possible modes!
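the tie-handling logic described above can also be sketched outside excel. the python snippet below is only an illustration under stated assumptions, not excel's implementation: the helper name mode_mult and the sample scores are made up for this example, and an empty list stands in for the #n/a error. it also assumes, as the mode.sngl discussion suggests, that tied modes come back in the order in which they first appear.

```python
from collections import Counter


def mode_mult(values):
    """Return every value tied for the highest count, in first-seen order.

    Hypothetical helper for illustration only. An empty list stands in
    for the #N/A error that Excel shows when no value repeats.
    """
    counts = Counter(values)  # a Counter remembers first-seen order
    if not counts:
        return []
    top = max(counts.values())
    if top < 2:
        return []  # nothing repeats, so there is no mode
    return [value for value, count in counts.items() if count == top]


scores = [88, 82, 75, 88, 82, 91]  # made-up sample data
modes = mode_mult(scores)

print(len(modes))                          # like =COUNT(MODE.MULT(range))
print(modes[0] if modes else "")           # like =MODE.SNGL(range)
print(modes[1] if len(modes) > 1 else "")  # like the two-way-tie IF formula
```

counting the returned values first, as the count(mode.mult(range)) formula does, avoids having to guess how many cells the answer will need.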
using trimmean to exclude outliers from the mean

sometimes a data set includes a few outliers that radically skew the average. for example, suppose you have a list of gross margin percentages. most percentages fall in the 45% to 50% range, but there was one deal where, for customer satisfaction reasons, the product was given away at a loss. this one data point would skew the average unusually low.

the trimmean function takes the mean of data points but excludes the n% highest and lowest values. you have to use some care in expressing the n%.

syntax: trimmean(array,percent)

the trimmean function returns the mean of the interior of a data set. trimmean calculates the mean taken by excluding a percentage of data points from the top and bottom tails of a data set. you can use this function when you want to exclude outlying data from your analysis. this function takes the following arguments:

• array: this is the array or range of values to trim and average.

• percent: this is the fractional number of data points to exclude from the calculation. for example, if percent is 0.2, 4 points are trimmed from a data set of 20 points (that is, 20 × 0.2): 2 from the top and 2 from the bottom of the set. if percent is less than 0 or percent is greater than 1, trimmean returns a #num! error.

trimmean rounds the number of excluded data points down to the nearest multiple of 2. if percent equals 0.1, 10% of 30 data points equals 3 points. for symmetry, trimmean excludes a single value from the top and bottom of the data set.

using geomean to calculate average growth rate

suppose that your 401(k) plan is invested in a stock market index fund. the stock market goes up 5%, 40%, and 15% in three successive years. taking the average of these numbers might lead someone to believe that the average increase was 20% per year. this is not correct. the growth rates are all multiplied together to find an ending value of your investment.
to find the average growth rate, you need to find a number that, when multiplied by itself three times, yields the same result as 105% × 140% × 115%. you can calculate this by using geomean.

to find the geometric mean of 10 numbers, you multiply the 10 numbers together and raise the product to the 1/10 power. excel lets you do this quickly with geomean.

syntax: geomean(number1,number2,...)

the geomean function returns the geometric mean of an array or a range of positive data. for example, you can use geomean to calculate average growth rate, given compound interest with variable rates.

the arguments number1,number2,... are 1 to 255 arguments for which you want to calculate the mean. you can also use a single array or a reference to an array instead of arguments separated by commas.

caution: the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored. however, cells that contain the value 0 are included. if any data point is less than or equal to 0, geomean returns a #num! error.

using harmean to find average speeds

the typical averaging function fails when you are averaging speeds over equal distances. suppose that your exercise route is split into eight segments of equal length: you walk one segment at 2 mph, run five segments at 5 mph, and then jog two segments at 3 mph. if you took the average of (2, 5, 5, 5, 5, 5, 3, 3), you would assume that you averaged 4.125 miles per hour. because the slow segments take longer to cover, the actual average speed is lower.

the actual calculation for average speed would be to take the reciprocals of each speed, average those values, and then take the reciprocal of the result. in the exercise example, you would average (1/2, 1/5, 1/5, 1/5, 1/5, 1/5, 1/3, 1/3) to obtain 13/48. then you would take the reciprocal, 48/13, to find the actual average speed of 3.69 mph.

syntax: harmean(number1,number2,...)

the harmean function returns the harmonic mean of a data set. the harmonic mean is the reciprocal of the arithmetic mean of reciprocals.

the arguments number1,number2,... are 1 to 255 arguments for which you want to calculate the mean. you can also use a single array or a reference to an array instead of arguments separated by commas. the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if any data point is less than or equal to 0, harmean returns a #num! error.

the harmonic mean is never greater than the geometric mean, which is never greater than the arithmetic mean. the three are equal only when every value in the data set is the same.

using averageif or averageifs

excel 2007 included two new conditional calculation functions: averageif and averageifs. these functions find the mean of records that match one or more criteria.

syntax: averageif(range,criteria,average_range)

averageif returns the arithmetic mean of all the cells in a range that meet a given criterion.

• range: one or more cells to average, including numbers or names, arrays, or references that contain numbers.

• criteria: the criteria in the form of a number, expression, cell reference, or text that defines which cells are averaged. for example, criteria can be expressed as 32, "32", ">32", "apples", or b4.

• average_range: the actual set of cells to average. if omitted, range is used.

syntax: averageifs(average_range,criteria_range1,criteria1,criteria_range2,criteria2...)

averageifs returns the arithmetic mean of all cells that meet multiple criteria.

• average_range: one or more cells to average, including numbers or names, arrays, or references that contain numbers.

• criteria_range1, criteria_range2, ...: 1 to 127 ranges in which to evaluate the associated criteria.
• criteria1, criteria2, ...: 1 to 127 criteria in the form of a number, expression, cell reference, or text that defines which cells will be averaged. for example, criteria can be expressed as 32, "32", ">32", "apples", or b4.

using rank to calculate the position within a list

at times you need to determine the order of values, but you are not allowed to sort the data. the rank function helps with this task.

suppose five bowlers scored 187, 185, 185, 170, and 160. the traditional way to rank the players is that two players would have a rank of 2, and the next player would have a rank of 4. no one would be ranked number 3. although this is technically correct, it can cause problems if you have lookup values expecting to find a person ranked number 3. the example at the end of this section explains how to overcome such a situation. the clever excellers who hoped to use rank to sort a list using formulas really want rank to return one of every rank.

in excel 2010, microsoft renamed the old rank as rank.eq. it added a new rank called rank.avg. in the same situation with the five bowlers, both of the scores of 185 would get a rank of 2.5. a rank of 2.5 is the average of the ranks of 2 and 3.

note: excel gurus have been complaining about an anomaly with the rank function. apparently, a bunch of scientists have also been complaining about rank. microsoft fixed the rank function in excel 2010. but they listened to the complaints of the scientists instead of the gurus. so, now there are two rank functions and neither one is going to make the excel pros happy.

syntax: rank.eq(number,ref,order)
rank.avg(number,ref,order)

neither rank.eq nor rank.avg will calculate in excel 2007 or earlier. in legacy versions of excel, use the rank function to calculate rank.eq. there was no equivalent of rank.avg in legacy versions of excel.

the rank functions return the rank of a number in a list of numbers. the rank of a number is its size relative to other values in a list. if you were to sort the list, the rank of the number would be its position. these functions take the following arguments:

• number: this is the number whose rank you want to find.

• ref: this is an array of, or a reference to, a list of numbers. nonnumeric values in ref are ignored.

• order: this is a number that specifies how to rank number. for a value of 0 or if this argument is omitted, excel ranks number as if ref were a list sorted in descending order. if order is any nonzero value, excel ranks number as if ref were a list sorted in ascending order.

rank.eq gives duplicate numbers the same rank, the lower of the positions they occupy. however, the presence of duplicate numbers affects the ranks of subsequent numbers. for example, in a list of integers ranked in ascending order, if the number 10 appears twice and has a rank of 5, then 11 would have a rank of 7 (no number would have a rank of 6).

rank.avg gives duplicate numbers the same rank by averaging the positions that the tied values occupy. in the same example, if the number 10 appears twice and would hold the #5 and #6 positions, both of the 10s would receive an average rank of 5.5.

in figure 14.6, column b contains a list of scores. the formula for cell c2 is =rank.eq(b2,$b$2:$b$13). notice that the third argument is omitted, so the highest score will be ranked as number 1. also notice that the second argument is marked as absolute so that the formula can be copied, and it will always point to the same ref range.

figure 14.6 in this case, rank.eq and rank.avg return different values when a tie occurs.

column d uses the new rank.avg to rank the scores. note the differences in rows 8 and 9. the tied values both receive a lower rank of 11 with rank.eq and a value of 11.5 with rank.avg.

if you need the lowest value to be ranked as number 1, add a third argument of 1 to indicate that the lowest number is the best. for example: =rank(b2,b2:b8,1).
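to make the difference between the two ranking styles concrete, here is a small python sketch of the five-bowler example. the function names rank_eq and rank_avg are hypothetical stand-ins that only mimic the behavior described in this section; they are not excel's code, and they assume that the number being ranked appears in the list.

```python
def rank_eq(number, ref, order=0):
    """Tied values share the best (lowest) of their positions.

    Hypothetical helper for illustration. order=0 ranks the largest
    value first; any other value ranks the smallest value first.
    """
    if number not in ref:
        raise ValueError("number must appear in ref")
    if order == 0:
        ahead = sum(1 for value in ref if value > number)
    else:
        ahead = sum(1 for value in ref if value < number)
    return ahead + 1


def rank_avg(number, ref, order=0):
    """Tied values share the average of the positions they occupy."""
    ties = sum(1 for value in ref if value == number)
    return rank_eq(number, ref, order) + (ties - 1) / 2


scores = [187, 185, 185, 170, 160]
print([rank_eq(s, scores) for s in scores])   # [1, 2, 2, 4, 5]
print([rank_avg(s, scores) for s in scores])  # [1.0, 2.5, 2.5, 4.0, 5.0]
```

in both results no bowler is ranked number 3, which is the gap that causes trouble for lookups.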
a common excel trick is to use the ranking function combined with vlookup or match to sort a range with a formula. you might assign ranks and then use vlookup to find the people who are ranked first, second, and third.

the vlookup function certainly is not expecting two people to be ranked at #2 as rank.eq would do. it definitely would never expect the duplicate 2.5 values that rank.avg would return. the generally accepted solution is to use rank.eq and then add a countif function that checks to see how many rows above this row have the identical value.

in figure 14.7, examine the formula in cell c8. countif asks how many times the value in cell b8 was found in b$2:b7. this final reference is an interesting reference. it tells excel to count always from row 2 down to the row above the current row. it is easier to build this formula in the final cell of the column and then copy it upward.

figure 14.7 you use a countif to break ties.

more than you ever wanted to know about the controversy of percentiles and quartiles

a huge argument is raging over the best way to calculate percentiles and quartiles. suppose that you have 11 scores. you ask excel to tell you the score at the 15th percentile. typically, excel assigns the smallest value to the 0th percentile and the largest value to the 100th percentile. because there are 10 steps between the smallest and largest value, the 10th percentile will be the second-smallest number, and the 20th percentile will be the third-smallest number. what number is at the 15th percentile? how about at the 17th percentile? excel divides the gap between the second and third values into 10 equal parts and uses that interpolation to calculate the number at the 11th percentile and so on.

in 1996, two scholars named hyndman and fan published a paper detailing nine different methods for calculating percentiles and quartiles.
legacy versions of excel used method #7, which was defined by gumbel. other software such as minitab and spss used method #6, which was defined by weibull. people who care a lot about percentiles and quartiles will talk about "hyndman and fan method #x."

if you want to read all the details, check out http://www.daheiser.info/excel/notes/note%20n.pdf for a comparison of the methods.

in excel 2010, the hyndman and fan method #7 is still available as quartile.inc, percentile.inc, and percentrank.inc. excel 2010 introduces support for hyndman and fan method #6 as quartile.exc, percentile.exc, and percentrank.exc.

here is what you have to know about the controversy: with a data set of 100 numbers or more, the difference between the .exc and .inc versions will be small, less than one-half of one percent. however, in small data sets of n=4 to n=7, the values at the first quartile can swing by 40%. for example, in the data set of {10,22,33,40}, weibull calculates the first quartile at 13 and gumbel calculates the first quartile at 19. the delta between 19 and 13 is 6, which is 46% of 13.

also, in the .exc version, there is no 0th percentile and no 100th percentile. the .exc stands for percentiles from 0% to 100% exclusive. the .inc stands for percentiles from 0% to 100% inclusive.

using quartile.inc to break a data set into quarters

use quartile.inc to divide populations into groups.

syntax: quartile.inc(array,quart)
quartile.exc(array,quart)

the old quartile function is included in excel for compatibility only. quartile.inc is the renamed version of quartile.

the quartile.inc function returns the quartile of a data set. quartiles are often used in sales and survey data to divide populations into groups. for example, you can use quartile.inc to find the top 25% of incomes in a population.
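the gap between the inclusive and exclusive calculations can be reproduced with a short python sketch using the {10,22,33,40} data set from the controversy discussion. the function names percentile_inc and percentile_exc are hypothetical, and the position formulas k×(n–1)+1 and k×(n+1) are the commonly cited descriptions of hyndman and fan methods 7 and 6, so treat this as an approximation of what excel does rather than its actual code.

```python
def percentile_inc(values, k):
    """Method 7 style: the smallest value is 0% and the largest is 100%."""
    data = sorted(values)
    n = len(data)
    if n == 0 or not 0 <= k <= 1:
        raise ValueError("stands in for Excel's #NUM! error")
    pos = k * (n - 1)            # zero-based fractional position
    low = int(pos)
    if low + 1 >= n:
        return float(data[-1])
    return data[low] + (pos - low) * (data[low + 1] - data[low])


def percentile_exc(values, k):
    """Method 6 style: there is no 0th or 100th percentile."""
    data = sorted(values)
    n = len(data)
    pos = k * (n + 1)            # one-based fractional position
    if n == 0 or pos < 1 or pos > n:
        raise ValueError("stands in for Excel's #NUM! error")
    low = int(pos)
    if low >= n:
        return float(data[-1])
    return data[low - 1] + (pos - low) * (data[low] - data[low - 1])


data = [10, 22, 33, 40]
print(percentile_inc(data, 0.25))  # 19.0, the first quartile under method 7
print(percentile_exc(data, 0.25))  # 13.0, the first quartile under method 6
```

with only four values the two first quartiles differ by 6, matching the swing described in the text, while the medians from both functions agree at 27.5.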
these functions take the following arguments:

• array: this is the array or cell range of numeric values for which you want the quartile value. if array is empty, quartile.inc returns a #num! error.

• quart: this indicates which value to return. you use 0 for the minimum value, 1 for the first quartile (25th percentile), 2 for the median value (50th percentile), 3 for the third quartile (75th percentile), and 4 for the maximum value. if quart is not an integer, it is truncated. if quart is less than 0 or if quart is greater than 4, quartile.inc returns a #num! error.

in figure 14.8, the formulas in b20:c23 break out the limits for each quartile. the formula in cell b20 is =quartile.inc(b2:b17,0) to find the minimum value. the formula in cells c20 and b21 is =quartile.inc(b2:b17,1) to define the end of the first quartile and the start of the second quartile.

after the quartile.inc functions build the table in b20:c23, the vlookup function returns the text in c2:c17. the formula in cell c2 is =vlookup(b2,b20:d23,3,true).

note: min, median, and max return the same value as quartile.inc when quart is equal to 0, 2, and 4, respectively.

using percentile.inc to calculate percentile

the quartile.inc function is fine if you are trying to find every record that is in the top 25% of a range. sometimes, however, you need to find some other percentile. for example, all employees ranked above the 81st percentile may be eligible for a bonus this year. you can use the percentile.inc function to determine the threshold for any percentile.

syntax: percentile.inc(array,k)
percentile.exc(array,k)

the excel 2007 percentile function is included in excel 2010 for compatibility with older versions. in excel 2010, the percentile.inc function is equivalent to percentile. the new percentile.exc function debuts in excel 2010.

the percentile.inc function returns the kth percentile of values in a range.
you can use this function to establish a threshold of acceptance. for example, you can decide to examine candidates who score above the 90th percentile. this function takes the following arguments:

• array: this is the array or range of data that defines relative standing. if array is empty, percentile.inc returns a #num! error.

• k: this is the percentile value in the range 0...1, inclusive. if k is nonnumeric, percentile.inc returns a #value! error. if k is less than 0 or if k is greater than 1, percentile.inc returns a #num! error. if k is not a multiple of 1/(n – 1), percentile.inc interpolates to determine the value at the kth percentile.

in figure 14.9, 33 employees are in column a. their ratings on an annual review are shown in column b. the formula in cell f3, =percentile.inc(b2:b34,f2), calculates the level of the 81st percentile. after you determine the particular percentile, you can mark all the qualifying employees in cells c2:c34 by using a formula that compares each rating in column b with the threshold in f3 (for example, =b2>f3).

figure 14.8 the quartile.inc function can break up a data set into four equal pieces.

using percentrank.inc to assign a percentile to every record

suppose you have a database of students in a graduating class. each student has a certain grade point average. to determine each student's standing in the class, you use the percentrank.inc function.

syntax: percentrank.inc(array,x,significance)
percentrank.exc(array,x,significance)

the excel 2007 percentrank function is being replaced by percentrank.inc. excel 2010 debuts the new percentrank.exc function.

the percentrank.inc function returns the rank of a value in a data set as a percentage of the data set. this function can be used to evaluate the relative standing of a value within a data set. for example, you can use percentrank.inc to evaluate the standing of an aptitude test score among all scores for the test.
this function takes the following arguments:

• array: this is the array or range of data with numeric values that defines relative standing. if array is empty, percentrank.inc returns a #num! error.

• x: this is the value for which you want to know the rank. if x does not match one of the values in array, percentrank.inc interpolates to return the correct percentage rank.

• significance: this is an optional value that identifies the number of significant digits for the returned percentage value. if it is omitted, percentrank.inc uses three digits (that is, 0.xxx). if significance is less than 1, percentrank.inc returns a #num! error.

the argument order of this function is different from that of rank, so use caution. typically, rank and other functions would ask for x as the first argument and array as the second argument. if you use this function and everyone is assigned to the 100% level, you might have reversed the arguments.

the excel help is a bit misleading with regard to significance. the help topic indicates that a significance of 3 generates a value accurate to 0.xxx%. in fact, a significance of 3 returns xx.x%.

in figure 14.10, the students' gpas are in b3:b302. the rank for the first student is =percentrank.inc(b3:b302,b2,3). note that percentrank.inc always starts with the lowest score at the lowest percentile.

figure 14.9 unlike quartile.inc, the percentile.inc function can determine the breaking point for any particular percentile.

the table in f7:g12 shows the actual behavior of the significance argument. the values in column g show the percentrank.inc of cell b2 to the significance in column f. you can see that the student ranked at the 99.6th percentile is in the 90th percentile when the significance is 1. a significance of 1 would assign 30 records to be at the 90th percentile.

figure 14.10 also shows the difference between percentrank.inc and percentrank.exc.
although christopher moon is at the 0.0th percentile in cell c302 using percentrank.inc, the .exc version never assigns a value to the 0.0 percentile. cell d302 shows a percentile of 0.3% for the lowest score in a 300-row data set.

figure 14.10 the percentrank.inc function assigns percentile values to an array of values.

using avedev, devsq, var.s, and stdev.s to calculate dispersion

functions such as average tell you about the center of a range of data. seeing the center is not always the entire picture. the other key element of descriptive statistics is dispersion. if you have a population, the average height might be x. if you look at dispersion, you can find out if every member of the population is tightly grouped around the average or if there is wide variability.

here are several measures of dispersion:

• average deviation is calculated by measuring the absolute difference of each data point from the mean and then averaging these values. suppose the values in a population are 12, 14, 16, 18, and 20. the mean is 16. average deviation adds up 4, 2, 0, 2, and 4 and divides the total by 5 to yield 2.4. excel offers avedev to calculate this.

• average deviation is not perfect. suppose you have another population of 11, 15, 16, 17, and 21. again, the mean is 16. the average deviation averages 5, 1, 0, 1, and 5 to yield an average deviation of 2.4, the same as for the first population. if you want to measure how far from the mean the points range, you can add up the squares of each deviation. in this case, the square of 5 is 25, and it indicates more dispersion than the square of 4. excel offers devsq to calculate the sum of the squares of the deviations.

• variance is a common measurement of dispersion. it averages the square deviations to come up with the variance of a data set. here is the one odd thing about variance: suppose you have 20 measurements, and they represent the entire population (for example, the 20 fish in an aquarium).
in this case, you divide devsq by 20 to calculate the variance. you use var.p in excel to do this. however, if your 20 values are a random sample, then variance is calculated by dividing devsq by 20 – 1, or 19. you use var.s in excel to calculate this.

• the measurement for variance is a square, right? you took all the deviations, squared them, and then averaged (or nearly averaged) them. the final popular measure of dispersion is calculated by taking the square root of the variance. this number is called standard deviation. excel offers two functions for standard deviation. you use stdev.p if your data set represents the entire population, and you use stdev.s if your data set represents only a sample of the population.

theories about standard deviation

there are many theories about standard deviation. one general rule states that about 95% of a normally distributed population will be located within two standard deviations of the mean. if you extend your range to within three standard deviations of the mean, that range should encompass 99.7% of the population.

figure 14.11 shows the lengths of fish. column a contains the lengths of all 20 fish in one particular tank at a science museum. column e contains the lengths of 20 random fish observed while snorkeling at a coral reef. both groups have a mean value of 18.58 inches, as shown in cells c4 and g4.

figure 14.11 although the averages are the same, the dispersion measurements paint a different picture of these populations.

the fish in the museum tank have an average deviation of 1.45 inches from the mean. cells c6, c8, and c10 walk through the calculation of squares of deviation, variance, and standard deviation. the theory about standard deviation says that of the fish in the tank, 95% will occur between 15.08 inches and 22.07 inches.

the fish at the coral reef have an average deviation of 6.7 inches from the mean. cells g6, g7, and g9 walk through the calculation of squares of deviation, variance, and standard deviation.
the theory about standard deviation says that of the fish at the coral reef, 95% will be between 0.78 inches and 36.37 inches long.

comparing these two results helps you to picture the likely populations of both locations. although both have the same mean size, the variety of fish (that is, the measure of dispersion) at the coral reef is much higher than that at the aquarium.

syntax: avedev(number1,number2,...)

the avedev function returns the average of the absolute deviations of data points from their mean. avedev is a measure of the variability in a data set. avedev is influenced by the unit of measurement in the input data.

syntax: devsq(number1,number2,...)

the devsq function returns the sum of squares of deviations of data points from their sample mean.

syntax: var.s(number1,number2,...)

the var.s function estimates variance based on a sample.

syntax: var.p(number1,number2,...)

the var.p function calculates variance based on the entire population.

syntax: stdev.s(number1,number2,...)

the stdev.s function estimates standard deviation based on a sample. the standard deviation is a measure of how widely values are dispersed from the average value (that is, the mean). the standard deviation is calculated using the "nonbiased" or "n – 1" method.

syntax: stdev.p(number1,number2,...)

the stdev.p function calculates standard deviation based on the entire population, given as arguments. the standard deviation is a measure of how widely values are dispersed from the average value (that is, the mean). stdev.p assumes that its arguments are the entire population. if your data represents a sample of the population, you can compute the standard deviation by using stdev.s. for large sample sizes, stdev.s and stdev.p return approximately equal values. the standard deviation is calculated using the "biased" or "n" method.

caution: logical values (true/false) are ignored in the stdev.s and stdev.p calculations. for some statistics, you need to figure out how many people answered true to a question. in order to count true values as 1 and false values as 0, you use the vara, varpa, stdeva, and stdevpa versions of those four functions.

in legacy versions of excel, var.s was simply known as var. var.p was known as varp. stdev.s was stdev. stdev.p was stdevp. if you are going to be sharing your workbook with people using legacy versions of excel, use the old names instead of the new names.

the arguments number1, number2,... are 1 to 255 arguments for which you want the average of the absolute deviations. you can also use a single array or a reference to an array instead of arguments separated by commas. the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included.

examples of functions for regression and forecasting

regression analysis allows you to predict the future, based on past events. suppose you have observed total sales for the past several years. regression analysis finds a line that best fits the past data points. you can then use the description of that line to predict results for the future data points.

regression works by finding a line that can best be drawn through existing data points. in real-life data, the data points aren't arranged exactly in a line. any line that the computer draws will have errors at any data point. regression finds the line that minimizes the errors at each data point.

consider the error in a regression line. the actual data point in year 1 might be higher than the regression line by 2.
in year 2, the data might be lower by 1, and in year 3 it might be lower by 1. if you added up these three errors, you would have an error of 0. this is a bad method. if you used this method to judge a line with errors of 400, –300, –100, it would also add up to an error of 0. instead, the regression engine sums the square of each error. in this case, the first line would have an error of 2² + (–1)² + (–1)² = 4 + 1 + 1, or 6. the second line would have an error of 400² + (–300)² + (–100)² = 160,000 + 90,000 + 10,000, or 260,000. with this method, the error for the first line is clearly smaller than the error for the second line. this method is called the least-squares method.

you might wonder why regression doesn’t add the absolute value of each error. ideally, the errors around the regression line should be narrow. a line with errors of –4, 4, –4, 4 would result in a sum of squares of 64. a line with errors of –7, 1, 7, –1 would result in a sum of squares of 100. the sum of squares method would deem the earlier line to be better, whereas using absolute values would call them equal (both sum to 16).

considerations when using regression analysis

you need to consider one question before doing regression analysis: is the data series growing linearly or exponentially? sales for a company might grow linearly. the number of bacteria cells in a petri dish might grow exponentially. you use linest and trend to predict sales that are growing linearly. you use logest and growth to predict bacteria that are growing exponentially.

in figure 14.12, the chart on the left shows sales over time. these sales are growing linearly and could probably be predicted fairly well by a straight line. the dotted line in the chart is the straight-line regression for the data set. although each data point is either above or below the regression line, the error at any given data point is fairly small. the chart on the right shows an exponential growth curve.
in this chart, the dotted line shows the regression line plotted using logest. again, although the dotted line does not correlate exactly with the actual data points, it is fairly close.

figure 14.12: these two data sets can be accurately predicted using regression.

here is the problem: regression always finds a line to fit your data set. in figure 14.13, no apparent correlation exists between sales and time. each year, the sales fluctuate wildly up or down. if you asked excel to use regression, it would gladly predict the dotted line shown in the graph. the problem is that this line has no predictive ability. if you base your future sales on this line, you will get results that will vary greatly from the prediction.

part of the results of regression analysis are statistics that tell how well the regression line fits the actual data. you should always check statistics such as r-squared or the standard error to see if the past data shows a relationship between the variables. the r-squared value is a value between 0 and 1. the closer that r-squared is to 1, the better the regression line. the r-squared for the left chart in figure 14.12 is 0.985. the r-squared for the chart in figure 14.13 is 0.000001, indicating that no correlation exists.

when you have data like the data in figure 14.13, it does not mean that you cannot use regression analysis. it means that you need to think about the data to see if other factors could help describe the data. suppose that the data represents sales of squares of roofing shingles in florida. if you add data to the chart that describes the number of category 3 hurricanes making landfall each year, the sales numbers begin to make sense. the r-squared for predicting sales based on year is nearly 0. the r-squared for predicting sales based on hurricanes is 0.987.
because an r-squared close to 1 indicates a nearly perfect correlation, you could base prediction of sales on a forecast of hurricanes.

figure 14.13: this data set has no correlation to time. linest happily predicts a line, but it is severely wrong most of the time.

regression function arguments

for all the following regression functions, the arguments list generally includes these two arguments (for brevity, they are described here once):

• known_y’s: this is an array or a cell range of numeric dependent data points. this is the range of data that you want to predict. it might be the actual sales for the past several years or the population of bacteria for the past several hours.

• known_x’s: this is the set of independent data points. these are the values that you think will lead to a prediction of the y values. for a simple time series, this might be a list of year numbers. it might be a list of other independent data points, such as the number of hurricanes making landfall each year.

the arguments must be numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if known_y’s and known_x’s are empty or have a different number of data points, the function returns a #n/a error.

functions for simple straight-line regression: slope and intercept

with many things in excel, there is a right way to do something. however, sometimes the powers-that-be decide that the right way is too difficult for excel customers, so they offer alternative, easier ways to solve problems.

the linest function is powerful, and using it is the right way to calculate straight-line regression. however, because the linest function returns an array of values, it seemed too difficult, so microsoft also offers the slope and intercept functions to retrieve the key results from linest.
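As a rough illustration of what slope and intercept retrieve, the following Python sketch computes a single-variable least-squares line in closed form. The function name `slope_intercept` and the sample numbers are made up for this example, and the code is not a description of how Excel implements these functions internally.

```python
def slope_intercept(known_ys, known_xs):
    """Return (slope, intercept) of the least-squares line for one x variable.

    Hypothetical helper for illustration; it mirrors the argument order of
    the worksheet functions (y values first, then x values).
    """
    if not known_ys or len(known_ys) != len(known_xs):
        # The worksheet functions report #n/a here; this sketch raises instead.
        raise ValueError("known_ys and known_xs must be non-empty and equal in length")
    n = len(known_ys)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    # Sum of cross-deviations and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known_xs, known_ys))
    sxx = sum((x - mean_x) ** 2 for x in known_xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sales figures for year numbers 1 through 5.
years = [1, 2, 3, 4, 5]
sales = [1100, 2050, 2900, 4100, 4950]
slope, intercept = slope_intercept(sales, years)
print(slope, intercept)  # slope is 975 and intercept is 95 for this made-up data
```

The slope is the change in predicted sales per year, and the intercept is the predicted value at a hypothetical year 0.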
in mathematical terms, a line is described as y = mx + b:

• y: this is the value you are trying to predict. it could be sales for a given year.

• b: this is called the y-intercept. this is the base level of sales that you can count on year after year.

• m: this is the slope of the line. if your sales are going up by 1,000 per year, the slope is 1,000. if your sales are going up by 100,000 per year, the slope is 100,000.

• x: this is a point along the x-axis. in a problem where you are measuring sales over a span of several years, you can assign year numbers 1, 2, 3, and so on to each year. x then corresponds to a year number.

if you have a series of year numbers and sales for each year, you need to calculate both the slope and intercept to describe the line.

syntax: slope(known_y’s,known_x’s)

the slope function returns the slope of the linear regression line through data points in known_y’s and known_x’s. the slope is the vertical distance divided by the horizontal distance between any two points on the line; in other words, it is the rate of change along the regression line.

syntax: intercept(known_y’s,known_x’s)

the intercept function calculates the point at which a line intersects the y-axis by using existing x values and y values. the intercept point is based on a best-fit regression line plotted through the known x values and known y values. you use the intercept when you want to determine the value of the dependent variable when the independent variable is 0.

in figure 14.14, the sales in b2:b11 are the dependent variables. in the language of excel, these are the known_y’s. you are predicting that sales are increasing linearly over time. the year numbers in a2:a11 are the independent variables. in the language of excel, these are the known_x’s.

the formula in cell e2 calculates the intercept for the line by using =intercept(b2:b11,a2:a11).
the answer of 49,041 means that the model predicts that your sales in a hypothetical year 0 would have been 49,041.

the formula in cell e3 calculates the slope of the line by using =slope(b2:b11,a2:a11). the answer of 4,230 means that the model predicts that your sales are increasing by about 4,230 each year.

figure 14.14: using the slope and intercept functions is a simple way to calculate a linear regression line.

when you have the slope and y-intercept, you can build a new table to predict future sales. you enter year numbers 11 through 15 in d8:d12. the formula in cell e8 needs to multiply the year number by the slope and add the intercept. that formula is =e2+e3*d8 (make the references to e2 and e3 absolute, as in $e$2 and $e$3, before copying the formula down). the values in cells e8 through e12 are one prediction of future sales. this assumes that the past trends continue to work over the next 5 years.

using linest to calculate straight-line regression with complete statistics

although slope and intercept would do the job, the more powerful function is linest. here is the difficulty: linest returns both the slope and the intercept. in addition, it returns a whole series of statistics. anytime a function returns several values, you must enter the function by using ctrl+shift+enter. you should also select a large enough range in advance before entering the formula. figuring out the size of the range in advance is difficult because it varies, depending on the shape of the independent variables and also if you ask for statistics.

however, linest is far more powerful than slope and intercept. additional arguments available in linest are not available in the easier functions.

syntax: linest(known_y’s,known_x’s,const,stats)

the linest function calculates the statistics for a line by using the least-squares method to calculate a straight line that best fits the data, and it returns an array that describes the line. because this function returns an array of values, it must be entered as an array formula with ctrl+shift+enter.

the equation for the line is y = mx + b or y = m1x1 + m2x2 + ... + b (if there are multiple ranges of x values), where the dependent y value is a function of the independent x values. the m values are coefficients corresponding to each x value, and b is a constant value.

note: y, x, and m can be vectors. the array that linest returns is backward from what you would expect. the slope for the last independent variable appears first: {mn,mn-1,...,m1,b}. linest can also return additional regression statistics.

the linest function takes the following arguments:

• known_y’s: this is the set of y values you already know in the relationship y = mx + b. if the array known_y’s is in a single column, each column of known_x’s is interpreted as a separate variable. if the array known_y’s is in a single row, each row of known_x’s is interpreted as a separate variable.

• known_x’s: this is an optional set of x values that you may already know in the relationship y = mx + b. the array known_x’s can include one or more sets of variables. if only one variable is used, known_y’s and known_x’s can be ranges of any shape, as long as they have equal dimensions. if more than one variable is used, known_y’s must be a vector (that is, a range with a height of one row or a width of one column). if known_x’s is omitted, it is assumed to be the array {1,2,3,...} that is the same size as known_y’s.

• const: this is a logical value that specifies whether to force the constant b to equal 0. if const is true or omitted, b is calculated normally. if const is false, b is set equal to 0, and the m values are adjusted to fit y = mx.

• stats: this is a logical value that specifies whether to return additional regression statistics. if stats is true, linest returns the additional regression statistics, so the returned array is {mn,mn-1,...,m1,b;sen,sen-1,...,se1,seb;r2,sey;f,df;ssreg,ssresid}.
if stats is false or omitted, linest returns only the m coefficients and the constant b. if you specify true for stats, the additional regression statistics shown in table 14.2 are possible return values.

table 14.2 additional regression statistics for linest

statistic: description

se1,se2,...,sen: the standard errors for the coefficients m1,m2,...,mn.

seb: the standard error for the constant b (seb = #n/a when const is false).

r2: the coefficient of determination. it compares estimated and actual y values and ranges in value from 0 to 1. if it is 1, there is a perfect correlation in the sample; that is, no difference exists between the estimated y value and the actual y value. at the other extreme, if the coefficient of determination is 0, the regression equation is not helpful in predicting a y value.

sey: the standard error for the y estimate.

f: the f statistic, or the f observed value. you use the f statistic to determine whether the observed relationship between the dependent and independent variables occurs by chance.

df: the degrees of freedom. you use the degrees of freedom to help you find f critical values in a statistical table. you compare the values you find in the table to the f statistic returned by linest to determine a confidence level for the model.

ssreg: the regression sum of squares.

ssresid: the residual sum of squares.

figure 14.16, later in this chapter, shows a visual map of the statistics being returned.

the accuracy of the line calculated by linest depends on the degree of scatter in the data. the more linear the data, the more accurate the linest model. linest uses the method of least squares for determining the best fit for the data.

the line- and curve-fitting functions linest and logest can calculate the best straight line or exponential curve that fits the data. however, you have to decide which of the two results best fits the data.
you can calculate trend(known_y’s,known_x’s) for a straight line or growth(known_y’s,known_x’s) for an exponential curve. these functions, without the new_x’s argument, return an array of y values predicted along that line or curve at your actual data points. you can then compare the predicted values with the actual values. you might want to chart them both for a visual comparison.

in regression analysis, microsoft excel calculates for each point the squared difference between the y value estimated for that point and its actual y value. the sum of these squared differences is called the residual sum of squares. microsoft excel then calculates the sum of the squared differences between the actual y values and the average of the y values, which is called the total sum of squares (that is, the regression sum of squares plus the residual sum of squares). the smaller the residual sum of squares compared with the total sum of squares, the larger the value of the coefficient of determination, r-squared, which is an indicator of how well the equation resulting from the regression analysis explains the relationship among the variables.

case study: application of regression analysis

suppose that you rent a snow-cone cart at a local amusement park. you create a table showing total snow cones sold for each day of last summer. in figure 14.15, column e shows the total snow cones sold by day. as you can see, the sales rise and fall sharply from day to day.

figure 14.15: the results of the linest function in g4:j8 are seemingly meaningless.

the previous manager of the cart had noticed certain trends in the data. sales were better on the weekends than on weekdays. sales were horrible when it rained. sales improved as the weather became hotter in july and august.

columns b:d in figure 14.15 contain data related to temperature, weekends, and rain. note that in column c, the weekend data is binary data, either 0 or 1.
in column d, the manager could have kept information about the amount of rainfall each day but instead kept this as binary data as well. if the day was predominantly rainy, the manager recorded a 1 to indicate a rainout. if the day had just a spot of rain, the manager recorded it as a nonrainy day.

to perform regression on this data, follow these steps:

1. total the number of independent variables and add one. this is the number of columns the results of the regression will occupy. in the snow cone cart example, that is four columns.

2. figure out how many rows the result of the regression will occupy. because you plan to ask for statistics in the snow cone example, this is five rows.

3. off to the side of the data, select a range that is four columns wide by five rows tall. this size is determined by the results of the first two steps.

4. start to type the formula, =linest(.

5. for the known_y’s, use the sales data in column e; this is e4:e95.

6. for the known_x’s, use the values for temperature, weekend, and rain. this is b4:d95. note that the dates in column a are not being used as an independent variable. the amusement park is an established park, and there is nothing to indicate that attendance rises over the course of the season.

7. use true for the next argument, const. true tells excel to calculate the intercept normally; false would force the intercept to be 0, which is not a requirement in the current situation.

8. use 1 or true for the stats argument.

figure 14.16: when you have the linest results, you can perform many more tests and charts that test how good the regression model is.

9. although you have now typed the complete formula, =linest(e4:e95,b4:d95,true,true), do not press the enter key. this is one formula that returns many results. you have to tell excel to interpret the formula as an array formula. to do this, hold down ctrl+shift while pressing enter.
the function returns a seemingly meaningless range of numbers, as shown in figure 14.15.

10. start labeling the regression results in the upper-right corner. the value in the upper-right corner is the y-intercept. this is equivalent to the result of the intercept function.

11. working in the top row from right to left, look at the slopes of the independent variables. these appear backward from how you originally specified them. your independent variables were temperature, weekend, and rain. the slope for the last independent variable is in the top-left corner of the results. in figure 14.16, cell g4 is the slope associated with rain. cell h4 is the slope associated with weekend. cell i4 is the slope associated with temperature.

12. take a look at these numbers for a second to see if they make sense. the intercept says you are going to sell –75 snow cones each day. this initially seems wrong. however, the value in column i says that you will sell 2.6 snow cones for every degree of temperature. because the lowest daily high temperature for the summer would be about 60 degrees, the result suggests that you would sell a minimum of (60 × 2.6), or about 156 snow cones, due to temperature. adding the –75 and 156 gets you to a minimum of about 80 snow cones on a sunny day. cell h4 suggests that you would sell about 52 extra snow cones on a weekend. cell g4 suggests that you would sell 102 fewer snow cones on a rainy day.

13. fill in the rest of the labels for statistics. the second row of the results shows the standard error for the number above it. the first column of the third row returns the all-important r-squared value. if this value is close to 1, your model is doing a good job of predicting the data. the value of 0.95 shows that this model is fairly good. row 3, column 2 shows the standard error of y. it is normal to have #n/a in any additional columns of row 3. row 4 contains the f statistic and degrees of freedom.
row 5 contains the sum of squares of the regression and the residual sum of squares. the residual sum of squares is the number that excel is trying to minimize when it fits the line using least squares.

14. in column l, build a formula to predict sales with the results of the regression. this formula would be intercept + (temperature slope × temperature) + (weekend slope × weekend) + (rain slope × rain). the formula in cell l4 is therefore =j4+i4*b4+h4*c4+g4*d4 (make the references to g4:j4 absolute, as in $j$4, before copying the formula down the column).

15. to visually compare the data, plot the actuals in column e and the prediction in column l on a chart. the chart in rows 12:22 shows that the prediction is tracking fairly well with the actual. there was a cold, rainy weekday near the beginning where the model predicted –10 sales versus an actual of 25.

16. for another interesting test, calculate the residual or error for each day. the data in column m is the difference between columns l and e. plot this data. you should see many small positive and negative values. (notice that the scale of this chart is smaller than the original chart.) the values should swing from positive to negative frequently. the amount of scatter should not vary over time. you should not see many clusters of points that are either positive or negative. the chart in rows 24:34 shows that there are many positive residuals early in the summer, and fewer later in the summer. this might mean that the model is less successful at lower june temperatures than at higher august temperatures. perhaps only real snow-cone fans buy the product at temperatures of 60 to 80. above 80 degrees, more people might buy the product.

troubleshooting linest

remember that linest returns an array of values. in addition, you need to select a large enough range before entering the function, and you need to use ctrl+shift+enter to enter the formula. if you forget to use ctrl+shift+enter, excel returns just the top-left cell from the result set.
in the data set in figure 14.15, this would be the slope for the final independent variable (–102.236). if you enter linest and receive just one value, you should follow these steps:

1. select a range starting with the linest formula in the upper-left corner. the range should be five rows tall. it should be at least two columns wide for models with one known_x column. add additional columns for additional known_x series.

2. press the f2 key to edit the current linest formula.

3. press ctrl+shift+enter to reenter the formula as an array.

alternatively, you can use the index function to pluck one particular value out of the linest function. for example, if you wanted to retrieve the f statistic from row 4, column 1, you could use =index(linest(e4:e95,b4:d95,true,true),4,1).

in the simpler situation when you have only one independent x variable, you can obtain the slope and y-intercept values directly by using the following formula for slope:

=index(linest(known_y’s,known_x’s),1)

use the following formula for the y-intercept:

=index(linest(known_y’s,known_x’s),2)

using forecast to calculate prediction for any one data point

when you understand straight-line regression, you can use the forecast function to return a prediction for any point in the future.

syntax: forecast(x,known_y’s,known_x’s)

the forecast function calculates, or predicts, a future value by using existing values. the predicted value is a y value for a given x value. the known values are existing x values and y values, and the new value is predicted by using linear regression. you can use this function to predict future sales, inventory requirements, or consumer trends.

the forecast function takes the following arguments:

• x: this is the data point for which you want to predict a value. if x is nonnumeric, forecast returns a #value! error.

• known_y’s: this is the dependent array or range of data.

• known_x’s: this is the independent array or range of data.
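The arithmetic behind a single-point forecast can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `forecast` only mimics the worksheet function, the sales figures are invented, and Python exceptions stand in for Excel's error values.

```python
def forecast(x, known_ys, known_xs):
    """Predict y at x from a least-squares line (illustrative sketch only)."""
    if not isinstance(x, (int, float)):
        # Stand-in for the #value! error on a nonnumeric x.
        raise TypeError("x must be numeric")
    if not known_ys or len(known_ys) != len(known_xs):
        raise ValueError("known_ys and known_xs must be non-empty and equal in length")
    n = len(known_ys)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(known_xs, known_ys))
    sxx = sum((xi - mean_x) ** 2 for xi in known_xs)
    # If every known x is identical, sxx is 0 and Python raises ZeroDivisionError.
    slope = sxy / sxx
    # Equivalent to intercept + slope * x, written around the means.
    return mean_y + slope * (x - mean_x)


# Invented data: sales rise by exactly 20 per year.
years = [2006, 2007, 2008, 2009, 2010]
sales = [100, 120, 140, 160, 180]
prediction = forecast(2011, sales, years)
print(prediction)  # 200 for this perfectly linear made-up series
```

Each call refits the line from scratch, so predicting several points this way repeats the regression once per point.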
note: forecast works only for straight-line regression. it also does not offer the capability to force the intercept to be 0. if you need this capability, you have to use linest and then build a prediction formula as in step 14 of the previous section or the trend function as discussed in the next section.

if known_y’s and known_x’s are empty or contain a different number of data points, forecast returns an #n/a error. if the variance of known_x’s equals 0, then forecast returns a #div/0! error.

figure 14.17 shows actual sales data for the past decade. years are in column a, and sales are in column c. the sales data in c2:c12 is the range of known_y’s. the years in a2:a12 are the range of known_x’s. to predict sales for future periods, follow these steps:

1. enter future years in a13:a17.

2. in column b, enter actual or forecast for each row so that the person reading the table understands that the new values are a forecast.

3. to predict sales for 2011, enter this formula in cell c13: =forecast(a13,c2:c12,a2:a12). make the two ranges absolute (as in $c$2:$c$12 and $a$2:$a$12) so that they do not shift when the formula is copied.

4. copy the formula from cell c13 down to c14:c17.

figure 14.17: you use the forecast function to find the data point for one future time period.

using trend to calculate many future data points at once

the trend function is another array function. this means that it can return many values from a single formula. if you think about the previous use of forecast in figure 14.17, you realize that excel really had to perform the linear regression multiple times, once for each of the cells in c13:c17. it would be better if you could perform the regression once and have excel calculate all the values from that regression. the trend function helps you do this.

syntax: trend(known_y’s,known_x’s,new_x’s,const)

the trend function returns values along a linear trend.
it fits a straight line (using the least-squares method) to the arrays known_y’s and known_x’s. it returns the y values along that line for the array of new_x’s that you specify.

the trend function takes the following arguments:

• known_y’s: this is the set of y values you already know in the relationship y = mx + b. if the array known_y’s is in a single column, each column of known_x’s is interpreted as a separate variable. if the array known_y’s is in a single row, each row of known_x’s is interpreted as a separate variable.

• known_x’s: this is an optional set of x values that you may already know in the relationship y = mx + b. the array known_x’s can include one or more sets of variables. if only one variable is used, known_y’s and known_x’s can be ranges of any shape, as long as they have equal dimensions. if more than one variable is used, known_y’s must be a vector (that is, a range with a height of one row or a width of one column). if known_x’s is omitted, it is assumed to be the array {1,2,3,...} that is the same size as known_y’s.

• new_x’s: these are new x values for which you want trend to return corresponding y values. new_x’s must include a column (or row) for each independent variable, just as known_x’s does. so, if known_y’s is in a single column, known_x’s and new_x’s must have the same number of columns. if known_y’s is in a single row, known_x’s and new_x’s must have the same number of rows. if you omit new_x’s, it is assumed to be the same as known_x’s. if you omit both known_x’s and new_x’s, they are assumed to be the array {1,2,3,...} that is the same size as known_y’s.

• const: this is a logical value that specifies whether to force the constant b to equal 0. if const is true or omitted, b is calculated normally. if const is false, b is set equal to 0, and the m values are adjusted so that y = mx.
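The argument defaults and the effect of const can be sketched for the single-variable case in Python. This is a simplified illustration, not Excel's implementation: it handles only one independent variable, the function name `trend` merely echoes the worksheet function, and the data is invented.

```python
def trend(known_ys, known_xs=None, new_xs=None, const=True):
    """Fit one line once, then return predicted y values for every new x.

    Single-variable sketch; the worksheet function also accepts several
    independent variables, which this example does not attempt.
    """
    if known_xs is None:
        # Omitted known x values default to 1, 2, 3, ...
        known_xs = list(range(1, len(known_ys) + 1))
    if new_xs is None:
        # Omitted new x values default to the known x values.
        new_xs = known_xs
    if not known_ys or len(known_ys) != len(known_xs):
        raise ValueError("known_ys and known_xs must be non-empty and equal in length")
    if const:
        n = len(known_ys)
        mean_x = sum(known_xs) / n
        mean_y = sum(known_ys) / n
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known_xs, known_ys))
        sxx = sum((x - mean_x) ** 2 for x in known_xs)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
    else:
        # Least-squares line forced through the origin: y = m * x.
        slope = sum(x * y for x, y in zip(known_xs, known_ys)) / sum(x * x for x in known_xs)
        intercept = 0.0
    return [intercept + slope * x for x in new_xs]


# Invented data that lies exactly on y = 2x + 1 for x = 1..4.
ys = [3, 5, 7, 9]
future = trend(ys, new_xs=[5, 6])
through_origin = trend(ys, new_xs=[3], const=False)
print(future, through_origin)
```

Unlike a per-cell forecast, the line is fitted once and then evaluated at every new x, which is the efficiency the section describes.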
case study: forecasting using regression analysis

suppose that you are responsible for forecasting the material needs for a company that supplies roofing material. you have historical trends of usage by year. you’ve included past hurricane and recession data because those events caused extraordinary demand. your job is to predict how much roofing material you will sell, assuming that there are no hurricanes, and how much you might want to have lined up in case there are one, two, or three hurricanes. here’s what you do:

1. as in the worksheet shown in figure 14.18, enter the actual data in a4:d23. make the sales in column d the known_y’s.

2. make the years, recession, and hurricane data in columns a:c the known_x’s.

3. enter a new table in a26:c33. you want to find the forecasted requirements for 2010 and 2011 for the possibility that there are zero, one, two, or three hurricanes. the year, recession, and hurricane columns must be in the same format as the known_x’s in step 2.

4. keep in mind that because the trend function is an array function, it can return several answers from one formula. select the range d26:d33. with that range selected, start to type the formula =trend(.

5. enter d5:d23 for known_y’s, which are past sales. enter a5:c23 for known_x’s. the new x values are the data in a26:c33.

6. ensure that your formula is now =trend(d5:d23,a5:c23,a26:c33). to finish the formula, hold down ctrl+shift while pressing enter.

figure 14.18: the trend function is an array formula that can do one regression and return many future data points.

the result is shown in d26:d33. the trend function predicts that you will need a base level of 115,000 in 2010 with no hurricanes. with two hurricanes in 2010, demand would rise to 174,000.

using logest to perform exponential regression

some patterns in business follow a linear regression. however, other items are not linear at all.
if you are a scientist monitoring the growth of bacteria in a petri dish, you will see exponential growth in the generations. if you try to fit an exponential growth to a straight line, you have a large error. if the r-squared from linear regression is too low, you can try using exponential regression to see if the pattern of data matches exponential regression better.

for exponential regression, you use the logest function, which is similar to the linest function.

syntax: logest(known_y’s,known_x’s,const,stats)

in regression analysis, the logest function calculates an exponential curve that fits the data and returns an array of values that describes the curve. because this function returns an array of values, it must be entered as an array formula.

the equation for the curve is y = b*m^x or y = b*(m1^x1)*(m2^x2)*... (if there are multiple x values), where the dependent y value is a function of the independent x values. the m values are bases that correspond to each exponent x value, and b is a constant value.

the logest function takes the following arguments:

• known_y’s: this is the set of y values you already know in the relationship y = b*m^x. if the array known_y’s is in a single column, each column of known_x’s is interpreted as a separate variable. if the array known_y’s is in a single row, each row of known_x’s is interpreted as a separate variable.

• known_x’s: this is an optional set of x values you may already know in the relationship y = b*m^x. the array known_x’s can include one or more sets of variables. if only one variable is used, known_y’s and known_x’s can be ranges of any shape, as long as they have equal dimensions. if more than one variable is used, known_y’s must be a range of cells with a height of one row or a width of one column (which is also known as a vector). if known_x’s is omitted, it is assumed to be the array {1,2,3,...} that is the same size as known_y’s.
• const: this is a logical value that specifies whether to force the constant b to equal 1. if const is true or omitted, b is calculated normally. if const is false, b is set equal to 1, and the m values are fitted to y = m^x.
• stats: this is a logical value that specifies whether to return additional regression statistics. if stats is true, logest returns the additional regression statistics (refer to figure 14.16), so the returned array is {mn,mn-1,...,m1,b;sen,sen-1,...,se1,seb;r2,sey;f,df;ssreg,ssresid}. if stats is false or omitted, logest returns only the m coefficients and the constant b.

the more a plot of data resembles an exponential curve, the better the calculated line fits the data. like linest, logest returns an array of values that describes a relationship among the values, but linest fits a straight line to the data; logest fits an exponential curve.

performing an exponential regression

figure 14.19 shows an estimated population in column b and the generation in column a. to perform an exponential regression, follow these steps:

1. because there is one independent variable, the results from the regression occupy two columns, so find a blank range of the spreadsheet and select a range that is two columns wide by five rows tall, such as e2:f6.
2. enter the beginning of the formula: =logest(. enter the known_y’s as b2:b9 and the known_x’s as a2:a9. leave the const value blank. specify true for statistics. the formula should be =logest(b2:b9,a2:a9,,true).
3. do not press enter for the formula. instead, hold down ctrl+shift while pressing enter to tell excel to interpret the result as an array formula and to return a table of values from logest.
4. add some labels to help interpret the statistics. the labels shown in columns d and g are examples.
5. to use the results of the regression in a prediction calculation, enter a different formula than with linest. the formula is intercept*slope^x.
in figure 14.19, to predict population values for a given generation in cell i2, use =f2*e2^i2. alternatively, you can use the growth function, discussed in the next section.

figure 14.19. when data is growing at an exponential rate, you use logest to perform a regression analysis.

using growth to predict many data points from an exponential regression

as the trend function is able to extrapolate points from a linear regression, the growth function is able to extrapolate points from an exponential regression.

syntax: =growth(known_y’s,known_x’s,new_x’s,const)

the growth function calculates predicted exponential growth by using existing data. growth returns the y values for a series of new x values that you specify by using existing x values and y values. you can also use the growth worksheet function to fit an exponential curve to existing x values and y values. this function takes the following arguments:

• known_y’s: this is the set of y values you already know in the relationship y = b*m^x. if the array known_y’s is in a single column, each column of known_x’s is interpreted as a separate variable. if the array known_y’s is in a single row, each row of known_x’s is interpreted as a separate variable. if any of the numbers in known_y’s is 0 or negative, growth returns a #num! error.
• known_x’s: this is an optional set of x values that you may already know in the relationship y = b*m^x. the array known_x’s can include one or more sets of variables. if only one variable is used, known_y’s and known_x’s can be ranges of any shape, as long as they have equal dimensions. if more than one variable is used, known_y’s must be a vector (that is, a range with a height of one row or a width of one column). if known_x’s is omitted, it is assumed to be the array {1,2,3,...} that is the same size as known_y’s.
• new_x’s: these are new x values for which you want growth to return corresponding y values.
new_x’s must include a column (or row) for each independent variable, just as known_x’s does. so, if known_y’s is in a single column, known_x’s and new_x’s must have the same number of columns. if known_y’s is in a single row, known_x’s and new_x’s must have the same number of rows. if new_x’s is omitted, it is assumed to be the same as known_x’s. if both known_x’s and new_x’s are omitted, they are assumed to be the array {1,2,3,...} that is the same size as known_y’s.
• const: this is a logical value that specifies whether to force the constant b to equal 1. if const is true or omitted, b is calculated normally. if const is false, b is set equal to 1, and the m values are adjusted so that y = m^x.

in figure 14.20, the original data is the population for the first 10 generations in a2:b11.

tip: when you have formulas that return arrays, you must enter them as array formulas after selecting the correct number of cells. to specify an array formula, you hold down ctrl+shift while pressing enter.

figure 14.20. growth performs an exponential regression and extrapolates the results in one step.

exponential regression used to predict future generations

it would be interesting to run an exponential regression and see the prediction not only for future generations but also for the known generations. this would allow you to see how well the prediction tracks with current values. to do this, follow these steps:

1. add new generation numbers in a12:a19. the growth function will use these numbers and return an array of values.
2. select the entire range c2:c19 for the results before entering the formula.
3. put the known_y’s in b2:b11. the known_x’s are in a2:a11. put the new_x’s in a2:a19. the formula is =growth(b2:b11,a2:a11,a2:a19).
4. after typing the formula, hold down ctrl+shift while pressing enter. this should cause the formula to return values in each cell in c2:c19.
5.
to visualize the original data and the prediction, plot a1:c19 on a line chart. numbers at the end of the progression (24 million) make the scale of the chart so large that you cannot see the detail of the first 12 generations.
6. right-click the numbers along the y-axis and select format axis. on the scale tab, select logarithmic scale. the resulting chart allows you to examine both the smaller and larger numbers in the chart.

using pearson to determine whether a linear relationship exists

remember that excel blindly fits a regression line to any data set. the fact that excel returns a regression line does not mean that you should use it to make any predictions. the initial question to ask yourself is, does a linear relationship exist in this data?

the pearson product–moment correlation coefficient, named after karl pearson, returns a value from –1.0 to 1.0. the calculation could make your head spin, but the important thing to know is that a pearson value closer to 1 or –1 means that a linear relationship exists. a value of 0 indicates no correlation between the independent and dependent variables.

syntax: =pearson(array1,array2)

the pearson function returns the pearson product–moment correlation coefficient, r, a dimensionless index that ranges from –1.0 to 1.0, inclusive, and reflects the extent of a linear relationship between two data sets.

note: i am somewhat jealous that microsoft has named an obscure function after fellow excel consultant chip pearson. i am lobbying microsoft for the inclusion of a jelen function, possibly used to measure the degree of laid-backness caused by the gel in your shoe insoles. seriously, chip pearson’s website is one of the best established sources of articles on the web about excel. to peruse the articles, visit www.cpearson.com.

the pearson function takes the following arguments:

• array1: this is a set of independent values.
• array2: this is a set of dependent values.
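
as a rough illustration of what pearson measures, the following python sketch computes r from the deviations of each series around its mean. it is not excel’s implementation; the function name, the error handling, and the sample data are assumptions made for this example.

```python
import math

def pearson(array1, array2):
    """Pearson product-moment correlation coefficient r for paired data."""
    if not array1 or len(array1) != len(array2):
        raise ValueError("arrays must be non-empty and the same length")
    mean_x = sum(array1) / len(array1)
    mean_y = sum(array2) / len(array2)
    # sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(array1, array2))
    sxx = sum((x - mean_x) ** 2 for x in array1)
    syy = sum((y - mean_y) ** 2 for y in array2)
    return sxy / math.sqrt(sxx * syy)

# hypothetical data: one series rises in lockstep with x, one falls
x_vals = [1, 2, 3, 4, 5]
rising = [3, 5, 7, 9, 11]
falling = [11, 9, 7, 5, 3]

r_up = pearson(x_vals, rising)      # perfectly linear, positive slope
r_down = pearson(x_vals, falling)   # perfectly linear, negative slope
r_squared = r_up * r_up             # squaring r gives r-squared
```

with this made-up data, r_up comes out at 1 and r_down at -1, the two extremes of the range described above.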
the arguments must be either numbers or names, array constants, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if array1 and array2 are empty or have a different number of data points, pearson returns an #n/a error.

the result of pearson is also sometimes known as r. multiplying pearson by itself leads to the more famous r-squared test.

using rsq to determine the strength of a linear relationship

r-squared is a popular measure of how well a regression line explains the variability in the y values. it is popular because the values range from 0 to 1. numbers close to 1 mean that the regression line does a great job of predicting the values. numbers close to 0 mean that the regression result can’t predict the values at all.

r-squared is the statistic in the third row, first column of a linest function. it is also the square of the pearson function. you could use =index(linest(),3,1) or =pearson()^2. but instead, excel provides the easy-to-remember rsq function.

syntax: =rsq(known_y’s,known_x’s)

the rsq function returns the square of the pearson product–moment correlation coefficient through data points in known_y’s and known_x’s. the r-squared value can be interpreted as the proportion of the variance in y that is attributable to the variance in x. for more information on the pearson coefficient, see the section on the pearson function, earlier in this chapter.

the rsq function takes the following arguments:

• known_y’s: this is an array or a range of data points.
• known_x’s: this is an array or a range of data points.

the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included.
if known_y’s and known_x’s are empty or have a different number of data points, rsq returns an #n/a error.

figure 14.21 shows four data sets and their associated r-squared values:

• the chart in the top-left corner has an r-squared near 0. there is little predictive ability in this regression line. in fact, the regression line is practically a horizontal line drawn through the mean of the data points.
• the chart in the lower-left corner has an r-squared of 0.48. there is a lot of variability in the dots, but they do seem to trend up. there are huge relative errors on certain data points (for example, the value of y = 1 when x = 7).
• the chart in the upper-right corner shows a nearly perfect correlation. the r-squared is appropriately high, at 0.988. this means that most of the variability in y is explained by x. there are some minor variations above or below the line, but the regression is doing a great job.
• the final chart, in the lower right, illustrates a perfect correlation and an r-squared of 1.0. every occurrence of y falls exactly on the regression line.

using steyx to calculate standard regression error

standard error is a measure of the quality of a regression line. in rough terms, the standard error is the size of an error that you might encounter for any particular point on the line. smaller errors are better, and larger errors are worse. standard error can also be used to calculate a confidence interval for any point.

syntax: =steyx(known_y’s,known_x’s)

the steyx function returns the standard error of the predicted y value for each x in the regression. the standard error is a measure of the amount of error in the prediction of y for an individual x.

the steyx function takes the following arguments:

• known_y’s: this is an array or a range of dependent data points.
• known_x’s: this is an array or a range of independent data points.
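
to make the idea of standard error concrete, here is a minimal python sketch of the calculation: fit a least-squares line, square the residuals, divide by the number of points minus two, and take the square root. the function name and the sample data are hypothetical, and the sketch does not replicate excel’s handling of text or empty cells.

```python
import math

def steyx(known_ys, known_xs):
    """Standard error of the predicted y for a simple linear regression."""
    n = len(known_xs)
    if n != len(known_ys) or n < 3:
        raise ValueError("need at least three paired data points")
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    # least-squares slope and intercept of the regression line
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known_xs, known_ys))
    sxx = sum((x - mean_x) ** 2 for x in known_xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # sum of squared residuals, divided by n - 2 degrees of freedom
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(known_xs, known_ys))
    return math.sqrt(sse / (n - 2))

# hypothetical data: a noisy series and a perfectly linear one
noisy_error = steyx([2, 4, 5, 8], [1, 2, 3, 4])
perfect_error = steyx([2, 4, 6, 8], [1, 2, 3, 4])
```

the perfectly linear series has no residuals, so its standard error is 0; the noisy series leaves a nonzero error that you would compare against the size of the y values being predicted.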
figure 14.21. as r-squared approaches 1.0, the predictive ability of the regression line improves.

the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if known_y’s and known_x’s are empty or have a different number of data points, steyx returns an #n/a error.

to calculate standard error, you square all the residuals and add them together. then you divide by the number of points minus two (the degrees of freedom left after fitting a slope and an intercept). finally, you take the square root of that result to calculate standard error.

in general, a lower standard error is better than a higher one. a standard error of $2,000 when you are trying to predict the price of a $30,000 car isn’t too bad. a standard error of $2,000 when you are trying to predict the price of a $3 jar of pickles is horrible. you need to compare the standard error to the size of the value you are predicting.

in figure 14.22, two regressions attempt to predict the price of a car based on either mileage or age. the standard error for the mileage method is a little less than the standard error for the age method.

figure 14.22. standard error is another measure of the quality of a regression line.

using covariance.p to determine whether two variables vary together

note: covariance.p is the new name for the old covar function. covar is included in excel 2010 for backward compatibility.

covariance is a measure of how greatly two variables vary together. if the value is 0, the variables do not appear to be related. for positive values, covariance indicates that as x increases, y also increases. for negative values, covariance indicates that as x increases, y decreases.
syntax:
=covariance.p(array1,array2)
=covariance.s(array1,array2)

the covariance.p function returns covariance, the average of the products of deviations for each data point pair. you use covariance to determine the relationship between two data sets. for example, you can examine whether greater income accompanies greater levels of education.

covariance.p assumes that the arrays represent an entire population. if the arrays are a sample of the population, use the new covariance.s function.

the covariance.p and covariance.s functions take the following arguments:

• array1: this is the first cell range of integers.
• array2: this is the second cell range of integers.

the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if array1 and array2 have different numbers of data points, covariance.p returns an #n/a error. if either array1 or array2 is empty, covariance.p returns a #div/0! error.

covariances can become incredibly large. the unit of measurement is on the order of x times y. for a dimensionless measurement of correlation, you use correl instead of covariance.p.

in figure 14.23, the covariance.p function measures the covariance between mileage and price. as mileage increases, price decreases.

using correl to calculate positive or negative correlation

instead of using covariance, you can calculate a correlation coefficient for two arrays. let’s use the mileage and price comparison from figure 14.23. the two values would have a strong positive correlation if price went up as mileage went up. a perfect positive correlation would result in a correlation coefficient of 1.0.

figure 14.23. covariance.p shows that price and mileage are inversely correlated.
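
the difference between the population and sample forms of covariance, and the way covariance grows with the units of measurement, can be sketched in a few lines of python. this is only an illustration: the function names mirror the excel names for readability, and the mileage and price figures are invented.

```python
def _sum_of_products(array1, array2):
    """Sum of the products of deviations from each array's mean."""
    if not array1 or len(array1) != len(array2):
        raise ValueError("arrays must be non-empty and the same length")
    mean_1 = sum(array1) / len(array1)
    mean_2 = sum(array2) / len(array2)
    return sum((a - mean_1) * (b - mean_2) for a, b in zip(array1, array2))

def covariance_p(array1, array2):
    # population form: divide by the number of pairs
    return _sum_of_products(array1, array2) / len(array1)

def covariance_s(array1, array2):
    # sample form: divide by the number of pairs minus one
    return _sum_of_products(array1, array2) / (len(array1) - 1)

# hypothetical used-car data: mileage in thousands of miles
mileage = [10, 20, 30, 40]
price_thousands = [20, 18, 15, 11]
price_dollars = [p * 1000 for p in price_thousands]

cov_p = covariance_p(mileage, price_thousands)
cov_s = covariance_s(mileage, price_thousands)
cov_p_dollars = covariance_p(mileage, price_dollars)
```

both results are negative because price falls as mileage rises. restating price in dollars instead of thousands multiplies the covariance by 1,000 without changing the relationship, which is why a dimensionless measure is easier to interpret.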
it is also possible (as in the mileage–price comparison case) for values to have an inverse correlation. as mileage increases, the price tends to decrease. if mileage were the only factor in the price of a car, the correlation coefficient would be –1.0 to indicate a perfect inverse correlation.

a correlation coefficient of 0 indicates that there is no correlation between the values.

syntax: =correl(array1,array2)

the correl function returns the correlation coefficient of the array1 and array2 cell ranges. you use the correlation coefficient to determine the relationship between two properties. for example, you can examine the relationship between a location’s average temperature and the use of air conditioners.

the correl function takes the following arguments:

• array1: this is a cell range of values.
• array2: this is a second cell range of values.

the arguments must be numbers or names, arrays, or references that contain numbers. if an array or reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if array1 and array2 have a different number of data points, correl returns an #n/a error. if either array1 or array2 is empty, or if s (the standard deviation) of their values equals 0, correl returns a #div/0! error.

in figure 14.24, price and mileage have a correlation coefficient of –0.87. this indicates a fairly strong inverse correlation. as mileage increases, price decreases. the bottom-left chart shows two series with no correlation at all; the correlation coefficient is very close to 0. the bottom-right chart shows two series with perfect positive correlation of 1.0.

figure 14.24. the correl function returns values from –1.0 to 1.0.
values near 0 indicate no correlation.

using fisher to perform hypothesis testing on correlations

the pearson value does not have a normal distribution. the graph of expected r values skews heavily toward 1. a statistician named fisher found a formula that would transform the skewed r value into a normal distribution. you use the fisher function to convert an r value. to take a fisher value and return it to an r value, you use fisherinv.

syntax: =fisher(x)

the fisher function returns the fisher transformation at x. this transformation produces a function that is approximately normally distributed rather than skewed. you use this function to perform hypothesis testing on the correlation coefficient.

the argument x is a numeric value for which you want the transformation. if x is nonnumeric, fisher returns a #value! error. if x is less than or equal to –1 or if x is greater than or equal to 1, fisher returns a #num! error.

syntax: =fisherinv(y)

the fisherinv function returns the inverse of the fisher transformation. you use this transformation when analyzing correlations between ranges or arrays of data. if y is equal to fisher(x), then fisherinv(y) is equal to x.

the argument y is the value for which you want to perform the inverse of the transformation. if y is nonnumeric, fisherinv returns a #value! error.

using skew and kurtosis

two final statistics are used to describe a population:

• skew: skew is an indicator of symmetry. actually, it is a measure of lack of symmetry. a skew value of 0 indicates that the population is perfectly symmetrical around the mean. negative values indicate that the data is skewed to the left of the mean. positive values indicate that the data is skewed to the right of the mean. you can use excel’s skew function to calculate skew.
• kurtosis: kurtosis indicates whether the distribution contains a spiky peak or is relatively flat.
this measure compares a population to the standard normal distribution. if the kurtosis is less than 0, the population is flatter than the normal distribution. if the kurtosis is greater than 0, the population is spikier than the normal distribution. you use excel’s kurt function to calculate kurtosis.

in figure 14.25, there are two populations. the population in column a contains one large spike of 19 data points at 2.36 inches and a single data point at 60.25 inches. you can think of this as a tank with 1 shark and 19 goldfish. the average size is 5.25 inches. the tail of the distribution is a very long tail to the right of the 5.25 inches mean, indicating a positive skew. the 19 goldfish cause a very spiky data point, causing a high kurtosis.

in column e, the data points are uniformly distributed around the mean. the data is perfectly symmetrical, leading to a skew of 0.00. no data point has more than one member, causing the data to be extremely flat, with a negative kurtosis.

syntax: =skew(number1,number2,...)

the skew function returns the skewness of a distribution. skewness characterizes the degree of asymmetry of a distribution around its mean. positive skewness indicates a distribution with an asymmetric tail extending toward more positive values. negative skewness indicates a distribution with an asymmetric tail extending toward more negative values.

the arguments number1,number2,... are 1 to 255 arguments for which you want to calculate skewness. you can also use a single array or a reference to an array instead of arguments separated by commas. the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if there are fewer than three data points, or if the sample standard deviation is 0, skew returns a #div/0! error.
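
the shark-and-goldfish tank makes a convenient check on the skew calculation. the python sketch below uses the sample skewness formula (deviations scaled by the sample standard deviation, cubed, summed, and multiplied by n/((n-1)(n-2))). the function name and the symmetric comparison list are assumptions for illustration; only the tank values come from the example above.

```python
import math

def skew(values):
    """Sample skewness; fails where the worksheet function returns #DIV/0!."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three data points")
    mean = sum(values) / n
    # sample standard deviation (divides by n - 1)
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if s == 0:
        raise ValueError("standard deviation is 0")
    cubed = sum(((v - mean) / s) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed

# 19 goldfish at 2.36 inches and 1 shark at 60.25 inches
tank = [2.36] * 19 + [60.25]
tank_mean = sum(tank) / len(tank)
tank_skew = skew(tank)

# a hypothetical, perfectly symmetric data set for comparison
symmetric_skew = skew([1, 2, 3, 4, 5])
```

the tank mean works out to about 5.25 inches and its skew is strongly positive, reflecting the long tail toward the shark, while the symmetric list has a skew of essentially 0.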
syntax: =kurt(number1,number2,...)

the kurt function returns the kurtosis of a data set. kurtosis characterizes the relative peakedness or flatness of a distribution compared with the normal distribution. positive kurtosis indicates a relatively peaked distribution. negative kurtosis indicates a relatively flat distribution.

figure 14.25. skew and kurtosis return information about the symmetry and spikiness of a data set.

the arguments number1,number2,... are 1 to 255 arguments for which you want to calculate kurtosis. you can also use a single array or a reference to an array instead of arguments separated by commas. the arguments must be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if there are fewer than four data points, or if the standard deviation of the sample equals zero, kurt returns a #div/0! error.

examples of functions for inferential statistics

inferential statistics is the powerful side of statistics. with descriptive statistics, you can describe a data set. describing a data set might allow you to understand the data set better. with regression, you use past trends to predict future results. with inferential statistics, you extrapolate information about a sample of the population to make predictions about the entire population.

understanding the language of inferential statistics

excel 2010 offers functions for 14 different types of probability distributions. each distribution predicts a different shape of the population. you’ll read about the various distributions in the following sections, but for now, consider the distribution in figure 14.26.

figure 14.26. the probability density function describes the probability that a member of a population has one specific value.
if you are asked to predict the likelihood that a member of a population has a value of 6, you can look at figure 14.26 and see that the probability is just over 10%. this value is called the probability density function, or pdf.

a different question would be to estimate what percentage of the population has a value of 6 or less. this is known as the cumulative distribution function (cdf). figure 14.27 graphs the cdf. to calculate the cdf without excel, you would have to use a little calculus to integrate from 0 to 6 over the distribution function. thankfully, with excel, you do not have to use calculus!

figure 14.27. the cumulative distribution function describes the probability that a member of a population is less than one specific value.

all 14 distribution functions in excel calculate both the pdf and cdf. look for an argument in each function called cumulative. if you specify true for cumulative, you are getting the cdf from figure 14.27. if you specify false, you are getting the pdf from figure 14.26.

when you see a function that includes .dist, this function is used to calculate either the pdf or the cdf, depending on whether you specify false or true for cumulative.

here is a simple math quiz: if the .dist function predicts that there is a 30% chance the value is 6 or less, what is the prediction that the value is more than 6? although you would have needed calculus to figure out the 30% answer, after you know the 30% answer, you don’t even need a calculator to know that the probability of the answer being more than 6 is 70%. if the area in figure 14.27 is 30%, the area in figure 14.28 is going to be 100% - 30%, or 70%. that is called the right-tailed cdf. you can calculate the right-tailed cdf by subtracting the left-tailed cdf from 100%. but, as a matter of convenience, excel offers .rt versions of chi-squared, f, and t distributions.
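
the relationship between the pdf, the left-tailed cdf, and the right-tailed cdf can be sketched with python’s standard statistics module. the figures above do not say which distribution they plot, so the normal distribution and its mean and standard deviation here are purely assumptions chosen for illustration.

```python
from statistics import NormalDist

# hypothetical population: normally distributed, mean 10, std deviation 3
population = NormalDist(mu=10, sigma=3)

# height of the density curve at 6 (like passing false for cumulative)
density_at_6 = population.pdf(6)

# probability of a value of 6 or less (like passing true for cumulative)
left_tail = population.cdf(6)

# probability of a value greater than 6: subtract the left tail from 100%
right_tail = 1 - left_tail
```

whatever distribution is used, the left and right tails always add up to 100%, which is the only arithmetic the .rt functions save you.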
figure 14.28. three distributions in excel include a function to calculate the right-tailed cdf.

some distributions are symmetric around a mean. with these distributions, you might ask about the probability of a member of the population being within x% of the mean. the opposite question is to find the probability that a member will fall outside of that area. this is known as a two-tailed cdf and is shown in the bottom chart in figure 14.29. excel offers a .2t version of the student’s t distribution.

the real-life applications of inferential statistics usually involve a different type of question. the manager of a bank wants to make sure that the staffing levels allow there to be no wait 80% of the time. whereas the .dist function can tell you the probability for a certain value, the .inv function does the opposite. it tells you the value at a certain probability. the .inv functions are always cumulative. of the 14 types of distributions, 9 of them offer an inverse function.

to recap, when you see an inferential statistics function in excel, it will start with the name of the distribution and be followed by one or more suffixes:

• .dist indicates that you can calculate the pdf or cdf.
• .rt indicates it will calculate the right-tailed cdf.
• .2t indicates it will calculate the two-tailed cdf.
• .inv indicates that it will find the value at which the cdf meets a certain probability.

figure 14.30 shows a matrix of the distribution functions in excel 2010.

figure 14.29. the area shown in the lower chart is a two-tailed cdf. t.dist.2t is the only function to calculate two-tailed probability.

figure 14.30. functions for inferential statistics in excel 2010.

using binom.dist to determine probability

a binomial test is a situation in which there are only two possible outcomes: either an event happens or it does not happen.
for example, suppose you have determined that on several nights of the week, someone has been sneaking in and eating leftovers from the department fridge. you don’t know if it is the night security guard or the cleaning crew or even just charley, who works later than everyone else. after tracking this behavior for a month, you determine that food has been missing 27% of the time. how many days next week will food be missing? the binom.dist function can answer this question.

syntax: =binom.dist(number_s,trials,probability_s,cumulative)

the binom.dist function returns the individual term binomial distribution probability. you use binom.dist in problems with a fixed number of tests or trials, when the outcomes of any trial are only success or failure, when trials are independent, and when the probability of success is constant throughout the experiment. for example, binom.dist can calculate the probability that two of the next three babies born will be male.

the binom.dist function takes the following arguments:

• number_s: this is the number of successes in trials.
• trials: this is the number of independent trials.
• probability_s: this is the probability of success on each trial.
• cumulative: this is a logical value that determines the form of the function. if cumulative is true, then binom.dist returns the cumulative distribution function, which is the probability that there are at most number_s successes; if cumulative is false, binom.dist returns the probability mass function, which is the probability that there are number_s successes.

number_s and trials are truncated to integers. if number_s, trials, or probability_s is nonnumeric, binom.dist returns a #value! error. if number_s is less than 0 or number_s is greater than trials, binom.dist returns a #num! error. if probability_s is less than 0 or probability_s is greater than 1, binom.dist returns a #num! error.
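
the binomial calculation itself is short enough to sketch in python. this is an illustration of the underlying formula rather than excel’s code: the function name simply echoes the worksheet function, invalid arguments raise an exception instead of returning #num!, and the 50% chance of a male birth is an assumption.

```python
from math import comb

def binom_dist(number_s, trials, probability_s, cumulative):
    """Binomial probability of exactly (or at most) number_s successes."""
    if not 0 <= number_s <= trials:
        raise ValueError("number_s must be between 0 and trials")
    if not 0 <= probability_s <= 1:
        raise ValueError("probability_s must be between 0 and 1")

    def mass(k):
        # probability of exactly k successes in the given number of trials
        return (comb(trials, k) * probability_s ** k
                * (1 - probability_s) ** (trials - k))

    if cumulative:
        return sum(mass(k) for k in range(number_s + 1))
    return mass(number_s)

# two of the next three babies are male, assuming a 50% chance each time
two_of_three_boys = binom_dist(2, 3, 0.5, False)

# missing-food example: 5 workdays, 27% chance on any given day
missing_by_days = [binom_dist(k, 5, 0.27, False) for k in range(6)]
at_most_one_day = binom_dist(1, 5, 0.27, True)
```

the six entries in missing_by_days cover every possible outcome from zero to five days, so they sum to 1, and the cumulative result is just the sum of the first two entries.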
in figure 14.31, range b5:b10 calculates the probability that food will be missing x days next week. in each case, trials is 5 because there are five workdays next week. the probability_s is 0.27. cell b15 calculates the cumulative probability that 0 or 1 successes will be encountered next week.

figure 14.31. for tests that are either true or false, the binom.dist function can calculate the probability of events.

caution: all the function names in figure 14.30 are new in excel 2010. none of those function names are backward compatible with excel 2007 or earlier.

tip: binom.dist always calculates the probability starting from the left side of the curve. if you wanted to calculate the probability of three or more successes next week, you would have to use one minus the probability of two or fewer successes, as shown in cell b22.

using binom.inv to cover most of the possible binomial events

many tests are binomial, as described in the preceding section. suppose you are exhibiting at a trade show. you expect 2,000 attendees at the trade show. based on data from past trade shows, you predict that there is a 17% chance that an attendee will visit your booth and take a catalog. your goal is to be 95% sure that you have enough catalogs for everyone. you can use the binom.inv function to predict how many catalogs you need.

syntax: =binom.inv(trials,probability_s,alpha)

the binom.inv function returns the smallest value for which the cumulative binomial distribution is greater than or equal to a criterion value. you use this function for quality assurance applications. for example, you can use binom.inv to determine the greatest number of defective parts you can allow to come off an assembly line run without needing to reject the entire lot.

the binom.inv function takes the following arguments:

note: binom.dist is a new name in excel 2010.
for compatibility with legacy versions of excel, use binomdist instead.

• trials: this is the number of bernoulli trials.
• probability_s: this is the probability of a success on each trial.
• alpha: this is the criterion value.

if any argument is nonnumeric, binom.inv returns a #value! error. if trials is not an integer, it is truncated. if trials is less than 0, binom.inv returns a #num! error. if probability_s is less than 0 or if probability_s is greater than 1, binom.inv returns a #num! error. if alpha is less than 0 or if alpha is greater than 1, binom.inv returns a #num! error.

in the trade show example, the number of trials is 2,000: each attendee has a chance of picking up a catalog. the probability_s is 17%, and alpha is 0.95, although it would be interesting to see how many catalogs could be required at each level. using this information, you follow these steps to determine how many catalogs you need:

1. build a range with different values for alpha in column a.
2. enter the formula =binom.inv($b$2,$b$1,a8) in cell b8.
3. copy the formula from cell b8 to the other cells in column b.

as shown in figure 14.32, you need to have 368 catalogs for the trade show.

note: in legacy versions of excel, binom.inv was called critbinom.

figure 14.32. based on response rates at last year’s trade show, you can use binom.inv to predict how many catalogs to print.

using negbinom.dist to calculate probability

it is a fact that lebron james has a career free-throw percentage of 0.741. what are the odds that james would miss three free throws before he makes one free throw? you can use excel’s negbinom.dist function to figure this out.

syntax: =negbinom.dist(number_f,number_s,probability_s,cumulative)

the negbinom.dist function returns the negative binomial distribution.
it returns the probability that there will be number_f failures before the number_s-th success, when the constant probability of a success is probability_s.

this function is similar to the binomial distribution function, except that the number of successes is fixed and the number of trials is variable. as with the binomial distribution function, trials are assumed to be independent.

for example, you need to find 10 people who have excellent reflexes, and you know the probability that a candidate has these qualifications is 0.3. negbinom.dist calculates the probability that you will interview a certain number of unqualified candidates before finding all 10 qualified candidates.

the negbinom.dist function takes the following arguments:

• number_f: this is the number of failures.
• number_s: this is the threshold number of successes.
• probability_s: this is the probability of a success.
• cumulative: this is a logical value. if it is true (or 1), negbinom.dist returns the cumulative distribution function; if it is false (or 0), it returns the probability mass function.

number_f and number_s are truncated to integers. if any argument is nonnumeric, negbinom.dist returns a #value! error. if probability_s is less than 0 or greater than 1, negbinom.dist returns a #num! error. if (number_f + number_s - 1) is less than or equal to 0, negbinom.dist returns a #num! error.

to solve the lebron james problem, you use =negbinom.dist(3,1,0.741,0). the answer is a probability of about 1.29% (0.259 × 0.259 × 0.259 × 0.741 ≈ 0.0129).

using poisson.dist to predict a number of discrete events over time

suppose you have to predict the number of discrete events that will happen over a certain period of time. this might be the number of customers who walk into a bank in an hour. it might be the number of lightning strikes on the john hancock building in a year. (it can also be discrete events that occur in a certain distance or area or any other measurement.) unlike the binomial distribution, in which an event either happens or does not happen, the poisson distribution allows zero, one, two, three, or more events in the period.
the nature of the poisson distribution is that before the third customer can walk into the bank, the second customer has to walk into the bank. in theory, if you had a run on the bank, the upper limit would be the total number of account holders, but in practice, there is probably some logical upper limit to how many customers walk in, such as the number that walk in during a friday payday lunch hour.

if you measure the average number of customers per hour over several weeks, you can use this number to predict the likelihood that a particular number of customers will enter the bank in any hour by using the poisson.dist function.

note: in excel 2007, the negbinom.dist function was called negbinomdist.

syntax: poisson.dist(x, mean, cumulative)

the poisson.dist function returns the poisson distribution. a common application of the poisson distribution is predicting the number of events over a specific time, such as the number of cars arriving at a toll plaza in one minute. this function takes the following arguments:

• x: this is the number of events.
• mean: this is the expected numeric value.
• cumulative: this is a logical value that determines the form of the probability distribution returned.

if x is not an integer, it is truncated. if x or mean is nonnumeric, poisson.dist returns a #value! error. if x is less than 0, poisson.dist returns a #num! error. if mean is less than or equal to 0, poisson.dist returns a #num! error.

if cumulative is true, poisson.dist returns the cumulative poisson probability that the number of random events occurring will be between 0 and x, inclusive; if cumulative is false, it returns the poisson probability mass function, that is, the probability that the number of events occurring will be exactly x.

to solve the bank customer example, follow these steps:

1. calculate the mean number of customers entering the bank per hour over several weeks. enter this in cell b1 of the worksheet.
2. in a4:a24, enter the numbers from 0 to 20.
3. in column b, calculate the probability that exactly n customers will enter the bank. in cell b4, enter the formula =poisson.dist(a4,b1,false).
4. in column c, calculate the probability that 0 to n customers will enter the bank. in cell c4, enter the formula =poisson.dist(a4,b1,true).

when you copy these formulas down their columns, anchor the mean as $b$1 so that the reference does not shift.

in figure 14.33, you can see that 84% of the time, your number of customers is expected to be between 0 and 11 customers per hour. if you staff up to handle 11 customers per hour, you should be covered about 84% of the time.

using frequency to categorize continuous data

the past few examples count whole numbers. it would be fairly difficult to have 0.3 persons walk into a bank, so the outcome from the poisson distribution has to be a whole number.

other measurements are continuous. the speed of a car passing a checkpoint is an example. depending on the accuracy of the radar unit, a car could be determined to be going 55.1, 55.2, 55.3, 55.4, and so on miles per hour. it would not make sense to try to predict how many cars will be going exactly 55.0123 miles per hour. if you did, you would be lucky to find even 2 cars at any single point along the continuous scale. typically, the prediction question would be, what percentage of cars are likely to be going between 65 and 70 miles per hour?

when you are working with a continuous range of measurements, the normal procedure is to group the measurements into ranges. statisticians call each range a bin.

in figure 14.34, the left chart shows the frequency curve for the speed of 2,000 cars passing a highway checkpoint. the recording unit measured speeds to an accuracy of 0.1 mph. the curve is incredibly noisy, with intense variation from point to point.

note: in excel 2007, the poisson.dist function was called poisson.

figure 14.33 you can figure the number of customers per hour by using poisson.dist.
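as a cross-check outside excel, the bank customer steps for poisson.dist can be sketched in python. this is a minimal sketch under stated assumptions, not the worksheet itself: the mean of 8 customers per hour is an assumed value (the text does not give the mean used in the figure), and the helper names poisson_pmf and poisson_cdf are hypothetical.

```python
import math

ASSUMED_MEAN = 8.0  # assumption: stands in for the measured mean in cell b1


def poisson_pmf(x, mean):
    """Probability of exactly x events, like poisson.dist(x, mean, false)."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


def poisson_cdf(x, mean):
    """Probability of 0 through x events, like poisson.dist(x, mean, true)."""
    return sum(poisson_pmf(k, mean) for k in range(x + 1))


# Rows mirror a4:c24: n, P(exactly n), P(0 through n).
table = [
    (n, poisson_pmf(n, ASSUMED_MEAN), poisson_cdf(n, ASSUMED_MEAN))
    for n in range(21)
]

for n, exact, up_to in table:
    print(f"{n:2d}  {exact:.4f}  {up_to:.4f}")
```

reading down the last column shows the staffing level at which the cumulative probability first reaches the coverage you want.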
figure 14.34 with continuous variables, you can group the observed values into bins to see the underlying distribution curve emerge.

the middle chart shows the frequency curve after the data has been fit into bins of 1 mph each. there is still some noise in the distribution. for some reason, fewer people happened to be going 56 mph.

the right chart shows the frequency curve after the data has been fit into bins of 5 mph each. this curve is very smooth and shows that the data points seem to follow the normal bell curve.

the process of grouping data into bins is handled with another array function: the frequency function.

syntax: frequency(data_array, bins_array)

the frequency function calculates how often values occur within a range of values, and it returns a vertical array of numbers. for example, you can use frequency to count the number of test scores that fall within ranges of scores. because frequency returns an array, it must be entered as an array formula. the frequency function takes the following arguments:

• data_array: this is an array of or a reference to a set of values for which you want to count frequencies. if data_array contains no values, frequency returns an array of zeros.
• bins_array: this is an array of or a reference to intervals into which you want to group the values in data_array. if bins_array contains no values, frequency returns the number of elements in data_array.

you enter frequency as an array formula after you select a range of adjacent cells into which you want the returned distribution to appear. the number of elements in the returned array is one more than the number of elements in bins_array. the extra element in the returned array returns the count of any values above the highest interval. for example, when counting three ranges of values (intervals) that are entered into three cells, you need to be sure to enter frequency into four cells for the results.
the extra cell returns the number of values in data_array that are greater than the third interval value. frequency ignores blank cells and text.

to use the frequency function, follow these steps:

1. figure out the expected range of values in the original data set. you can do this by sorting the data set or by using the min and max functions.
2. decide on your bin sizes. each bin should be roughly the same size. use enough bins to get an accurate picture but not so many bins that the data becomes spiky and noisy. in figure 14.35, the goal was bins of 5 mph each.
3. enter the bins. this process is a bit tricky: you enter the upper limit of each bin. if you want a bin for 40–45 mph, enter the number 45. for the bin of 45–50 mph, enter the number 50. in c2:c10, the numbers represent bins starting with 40–45 and ending with 80–85.
4. select the range where the values will be returned. (the frequency function returns several values at once.) in figure 14.35, select cells d2:d11. notice that this selection is one cell larger than your range of bins. the function returns one extra value in case there are any speeds faster than your top bin speed.
5. with d2:d11 selected, type the formula =frequency(a2:a2001,c2:c10). do not press enter at the end. you have to tell excel to evaluate the formula as an array formula, so hold down ctrl+shift and then press enter.

excel automatically groups the 2,000 individual data points into 10 counts: the nine bins plus the extra cell for speeds above the top bin. you can then chart or analyze this range.

to see a demo of using frequency, search for “excel in depth 14” at youtube.

using norm.dist to calculate the probability in a normal distribution

in figure 14.36, the observed speeds along a highway seem to be following a normal distribution. a normal distribution is sometimes referred to as a bell curve. when you have a normal distribution, the curve can be described mathematically using only the average and standard deviation of the data.
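the binning steps described for the frequency function can be sketched in python as a cross-check. this is a minimal sketch, not excel's implementation: the function name frequency is reused only for readability, the sample speeds are made-up values, and the sketch assumes the bin limits are already in ascending order.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per bin: a value lands in the first bin whose upper
    limit is greater than or equal to it; values above the last limit
    are counted in one extra slot at the end."""
    counts = [0] * (len(bins) + 1)
    for value in data:
        counts[bisect_left(bins, value)] += 1
    return counts


# Upper limits 45, 50, ..., 85: nine bins, like c2:c10 in the text.
bin_limits = list(range(45, 90, 5))

# Made-up sample speeds for illustration only.
speeds = [38.0, 44.9, 45.0, 45.1, 62.3, 85.0, 91.2]

counts = frequency(speeds, bin_limits)
print(counts)
```

the result has one more element than the list of bin limits, which is why the text selects d2:d11 for bins in c2:c10.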
figure 14.35 the tedious process of grouping values into ranges is handled easily with the frequency function.

figure 14.36 if your data is normally distributed, you can predict the future by using norm.dist.

the norm.dist function has a strange twist: in its cumulative form, it always returns the probability that a car will be going less than or equal to a value x. if you want to know the probability that the next car will be traveling between 65 and 75 mph, you have to figure out the cumulative probability of the car going less than 75 miles per hour and then subtract the cumulative probability of the car going less than 65 miles per hour. this requires two calls to the norm.dist function.

syntax: norm.dist(x, mean, standard_dev, cumulative)

the norm.dist function returns the normal distribution for the specified mean and standard deviation. this function has a very wide range of applications in statistics, including hypothesis testing. this function takes the following arguments:

• x: this is the value for which you want the distribution.
• mean: this is the arithmetic mean of the distribution.
• standard_dev: this is the standard deviation of the distribution.
• cumulative: this is a logical value that determines the form of the function. if cumulative is true, norm.dist returns the cumulative distribution function; if cumulative is false, norm.dist returns the probability density function.

if mean or standard_dev is nonnumeric, norm.dist returns a #value! error. if standard_dev is less than or equal to 0, norm.dist returns a #num! error. if mean is 0 and standard_dev is 1, norm.dist returns the standard normal distribution.

in figure 14.36, the range of observed values is in a2:a2001. formulas in cells d1 and d2 calculate the average and standard deviation of the data set. the goal is to find the probability of any car going between 65 and 75 mph. the formula in cell f4 is =norm.dist(75,d1,d2,true); it predicts the likelihood of a car going 75 mph or less at 95.3%. the formula in cell f5 is =norm.dist(65,d1,d2,true). this predicts the probability of a car going 65 mph or less at 50.2%. you can back into the probability that the car will be going between 65 and 75 mph by subtracting 50.2% from 95.3%. the answer to your problem is a 45.1% probability that the next car passing the checkpoint will be going between 65 and 75 mph.

note: in excel 2007, the norm.dist function was called normdist.

using norm.inv to calculate the value for a certain probability

in the preceding section, you used norm.dist to find the probability that a car was going less than 75 mph. sometimes, you might want to find the speed associated with a certain probability. for example, say you need to design a billboard that can be read by 80% of the drivers. if you know the mean and standard deviation of the speeds on the highway, you can use the norm.inv function to ask excel to tell you that 80% of the drivers will be driving at x miles per hour or less.

syntax: norm.inv(probability, mean, standard_dev)

the norm.inv function returns the inverse of the normal cumulative distribution for the specified mean and standard deviation. this function takes the following arguments:

• probability: this is a probability corresponding to the normal distribution.
• mean: this is the arithmetic mean of the distribution.
• standard_dev: this is the standard deviation of the distribution.

if any argument is nonnumeric, norm.inv returns a #value! error. if probability is less than 0 or greater than 1, norm.inv returns a #num! error. if standard_dev is less than or equal to 0, norm.inv returns a #num! error.

norm.inv uses an iterative technique for calculating the function. given a probability value, norm.inv iterates until the result is accurate to within ±3 × 10^-7.
if norm.inv does not converge after 100 iterations, the function returns an #n/a error.

in figure 14.37, a sample of speeds is listed in column a. the formulas in cells d1 and d2 calculate the mean and standard deviation. if you assume that the speeds follow a normal distribution, then 80% of the cars will be traveling 70 mph or less along this stretch of highway. the formula in cell e6 is =norm.inv(d6,d1,d2).

figure 14.37 rather than use goal seek with the norm.dist function, you can let excel handle the iterations to back into an answer using norm.inv.

using norm.s.dist to calculate probability

before the days of spreadsheets, most statistics textbooks had tables of probabilities. in such a textbook, the basic problem states, for example, that the mean is 57.1 and the standard deviation is 8.2. to calculate the probability that a member of the population would have a value of 64 or less, your first step is to calculate a z value. z is simply the number of standard deviations away from the mean. in this case, 64 is 6.9 units above the mean. the standard deviation is 8.2. your z score is 6.9 / 8.2, or 0.841. thus, you need to find the probability that any value is at 0.841 standard deviations above the mean or less.

you then turn to a large appendix in the back of the textbook that lists many different z scores and the probability associated with each one. the table would look somewhat like figure 14.38. depending on the accuracy of the table, you could find the probability associated with the z score. in figure 14.38, you would go down the left column to the 0.8 row and across the table to the 0.04 column to find a value of 0.7995. this means that there is a 0.7995 probability that any random member will be at 0.84 standard deviations above the mean or below it.

the norm.s.dist function makes this table obsolete. (in fact, i created the table in the figure by using norm.s.dist.)
although the typical statistics textbook would show the approximate probability for z = 0.84 as 0.7995, excel can now calculate the more precise probability for z = 0.841 as 0.7998.

note: in excel 2007, norm.inv was norminv.

syntax: norm.s.dist(z, cumulative)

the norm.s.dist function returns the standard normal distribution function. the distribution has a mean of 0 and a standard deviation of 1. you use this function in place of a table of standard normal curve areas. the argument z is the value for which you want the distribution. if z is nonnumeric, norm.s.dist returns a #value! error.

changes to norm.s.dist function in excel 2010

in excel 2007, the normsdist function was always cumulative. in the process of rewriting the function for excel 2010, microsoft added the cumulative argument. if you use true or 1 for this argument, norm.s.dist calculates the cumulative distribution function just as normsdist would have done. if you use false or 0, norm.s.dist returns the probability density function, that is, the height of the standard normal curve at z rather than an area under it.

an example might make this clear. at the mean (z = 0), the cumulative norm.s.dist reports 50%. at z = 0.26, the cumulative function reports about 60% of the items. the noncumulative function at z = 0.26 returns about 0.3857, which is the height of the curve at that point. if you want only the amount from the mean to the z score of 0.26, subtract 50% from the cumulative result to get about 10%.

using norm.s.inv to calculate a z score for a given probability

to calculate a z score for a given probability, you use the norm.s.inv function. in figure 14.39, the z score for 15% is -1.036. this means that in a normally distributed population, 15% of the population lies at or below the value of the mean minus 1.036 standard deviations.
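the z score figures quoted in this section can be cross-checked with a minimal python sketch that uses the standard library's statistics.NormalDist (python 3.8 or later). this is an illustration, not how excel computes these functions. note that the pdf method gives the height of the density curve, which corresponds to norm.s.dist with cumulative set to false; the area between the mean and z comes from subtracting 0.5 from the cumulative value.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Textbook example: mean 57.1, standard deviation 8.2, value 64.
z_rounded = round((64 - 57.1) / 8.2, 3)

# Like norm.s.dist(z, true): area to the left of z.
cumulative_at_z = std_normal.cdf(z_rounded)

# Like norm.s.dist(0.26, false): height of the curve, not an area.
density_at_026 = std_normal.pdf(0.26)

# Area between the mean and z = 0.26.
area_mean_to_026 = std_normal.cdf(0.26) - 0.5

# Like norm.s.inv(0.15): z score with 15% of the population at or below it.
z_for_15_percent = std_normal.inv_cdf(0.15)

print(z_rounded, cumulative_at_z, density_at_026,
      area_mean_to_026, z_for_15_percent)
```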
figure 14.38 the norm.s.dist formula in cell c17 makes tables of probabilities in statistics textbooks (like the one displayed in a2:k14) obsolete.

syntax: norm.s.inv(probability)

the norm.s.inv function returns the inverse of the standard normal cumulative distribution. the distribution has a mean of 0 and a standard deviation of 1. the argument probability is a probability that corresponds to the normal distribution. if probability is nonnumeric, norm.s.inv returns a #value! error. if probability is less than 0 or greater than 1, norm.s.inv returns a #num! error.

norm.s.inv uses an iterative technique for calculating the function. given a probability value, norm.s.inv iterates until the result is accurate to within ±3 × 10^-7. if norm.s.inv does not converge after 100 iterations, the function returns an #n/a error.

the z score refers to a number of standard deviations away from the mean. if the z score is negative, the value lies to the left of the mean. if the z score is positive, the value lies to the right of the mean.

using standardize to calculate the distance from the mean

to calculate the distance from a mean, use the standardize function. this function returns the positive or negative distance from the mean, expressed as the number of standard deviations.

syntax: standardize(x, mean, standard_dev)

the standardize function returns a normalized value from a distribution characterized by mean and standard_dev. this function takes the following arguments:

• x: this is the value you want to normalize.
• mean: this is the arithmetic mean of the distribution.
• standard_dev: this is the standard deviation of the distribution.

if standard_dev is less than or equal to 0, standardize returns a #num! error.

in figure 14.40, a population has a mean of 65 and a standard deviation of 5. the normalized value of 75 is 2, indicating that 75 is 2 standard deviations away from the mean of 65.
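the arithmetic behind standardize is simple enough to sketch in a few lines of python. this is an illustrative sketch; the function below borrows the excel name for readability, and raising an exception is only a stand-in for the #num! error.

```python
def standardize(x, mean, standard_dev):
    """Distance of x from the mean, in standard deviations."""
    if standard_dev <= 0:
        # excel returns a #num! error in this case
        raise ValueError("standard_dev must be greater than 0")
    return (x - mean) / standard_dev


z = standardize(75, 65, 5)

# Going the other way: mean plus z standard deviations recovers the value.
value_back = 65 + z * 5

print(z, value_back)
```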
figure 14.39 you can back into a z score from a probability by using norm.s.inv. you can then multiply the z score by a standard deviation to figure out the distance that your value lies from the mean.

using student’s t-distribution for small sample sizes

all the previous examples using a normal distribution assume that the sample size is 30 or more. if you are using a small sample size (even as small as three members), you should use the student’s t-distribution.

an important concept in the student’s t-distribution is the degrees of freedom. if you know the mean of the sample but not the standard deviation of the population, the degrees of freedom is the sample size minus 1. when the degrees of freedom is 29 or above, the student’s t-distribution is nearly identical to the normal distribution. however, as the degrees of freedom drops, the distribution becomes flatter and wider.

changes to tdist function in excel 2010

microsoft added several functions to excel 2010. in legacy versions of excel, the tdist function required three arguments: x, degrees of freedom, and number of tails. in excel 2010, microsoft moved the number of tails to the function name.

the excel 2007 two-tailed function tdist(2.5,10,2) is now equivalent to excel 2010’s t.dist.2t function: t.dist.2t(2.5,10).

the excel 2007 one-tailed function tdist(2.5,10,1) is now equivalent to the t.dist.rt function: t.dist.rt(2.5,10).

excel 2010 offers a new t.dist function with arguments for x, degrees of freedom, and cumulative.
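the relationship between the one-tailed and two-tailed forms can be checked numerically. python's standard library has no student's t-distribution, so this sketch integrates the t density with simpson's rule. the helper names t_pdf, t_right_tail, t_two_tail, and t_left_cdf are hypothetical, the approach is an illustration rather than excel's algorithm, and it assumes x is 0 or greater and a modest number of degrees of freedom.

```python
import math


def t_pdf(t, df):
    """Density of the student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_right_tail(x, df, steps=2000):
    """P(T > x), like t.dist.rt(x, df); simpson's rule on [0, x]."""
    h = x / steps  # steps must be even for simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_x = total * h / 3
    return 0.5 - area_zero_to_x  # half the area lies right of 0


def t_two_tail(x, df):
    """P(|T| > x), like t.dist.2t(x, df)."""
    return 2 * t_right_tail(x, df)


def t_left_cdf(x, df):
    """P(T <= x), like t.dist(x, df, true)."""
    return 1 - t_right_tail(x, df)


print(t_right_tail(2.5, 10), t_two_tail(2.5, 10), t_left_cdf(2.5, 10))
```

the two-tailed result is exactly twice the right-tailed result, which is why the old tails argument could be folded into the function name.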
figure 14.40 standardize does the basic math to calculate the distance from the mean, expressed as a number of standard deviations.

syntax:
t.dist(x, degrees_freedom, cumulative)
t.dist.2t(x, degrees_freedom)
t.dist.rt(x, degrees_freedom)

the t.dist.2t function returns the percentage points (that is, probability) for the student’s t-distribution, where a numeric value (x) is a calculated value of t for which the percentage points are to be computed. the t-distribution is used in the hypothesis testing of small sample data sets. you use this function in place of a table of critical values for the t-distribution. the t.dist.2t function takes the following arguments:

• x: this is the numeric value at which to evaluate the distribution.
• degrees_freedom: this is an integer that indicates the number of degrees of freedom.

if any argument is nonnumeric, these functions return a #value! error. if degrees_freedom is less than 1, they return a #num! error. the degrees_freedom argument is truncated to an integer.

syntax: t.inv.2t(probability, degrees_freedom)

the t.inv.2t function returns the t-value of the student’s t-distribution as a function of the probability and the degrees of freedom. this function takes the following arguments:

• probability: this is the probability associated with the two-tailed student’s t-distribution.
• degrees_freedom: this is the number of degrees of freedom to characterize the distribution.

if either argument is nonnumeric, t.inv.2t returns a #value! error. if probability is less than 0 or greater than 1, t.inv.2t returns a #num! error. if degrees_freedom is not an integer, it is truncated. if degrees_freedom is less than 1, t.inv.2t returns a #num! error.

t.inv.2t returns the value t such that p(|x| > t) = probability, where x is a random variable that follows the t-distribution.

t.inv.2t uses an iterative technique for calculating the function.
given a probability value, t.inv.2t iterates until the result is accurate to within ±3 × 10^-7. if t.inv.2t does not converge after 100 iterations, the function returns an #n/a error.

note: excel 2010’s t.inv.2t is equivalent to tinv in legacy versions of excel. the excel 2010 function t.inv is a new function to provide the inverse of the cumulative t.dist function.

excel can also calculate the t-test to predict whether two samples come from populations with the same mean. for this, you use the t.test function.

syntax: t.test(array1, array2, tails, type)

the t.test function returns the probability associated with a student’s t-test. you use t.test to determine whether two samples are likely to have come from the same two underlying populations that have the same mean.

the t.test function takes the following arguments:

• array1: this is the first data set.
• array2: this is the second data set.
• tails: this specifies the number of distribution tails. if tails is 1, t.test uses the one-tailed distribution. if tails is 2, t.test uses the two-tailed distribution.
• type: this is the kind of t-test to perform. see table 14.3 for more information.

table 14.3 types of t-tests available with the t.test function

if type equals    this test is performed
1                 paired
2                 two-sample equal variance (homoscedastic)
3                 two-sample unequal variance (heteroscedastic)

if array1 and array2 have a different number of data points, and if type is 1 (paired), t.test returns an #n/a error. the tails and type arguments are truncated to integers. if tails or type is nonnumeric, t.test returns a #value! error. if tails is any value other than 1 or 2, t.test returns a #num! error.

in figure 14.41, the means of the two samples are different: 11.15 versus 13.5. however, in cell f2, t.test returns 0.1577. because this is greater than the typical alpha of 0.05, the difference in means is not statistically significant at that level.
it is possible that these two samples were taken from the same population.

figure 14.41 t.test provides a formulaic equivalent to the key result from the analysis toolpak’s t-test feature.

using chisq.test to perform goodness-of-fit testing

a chi-squared test compares expected frequencies with observed frequencies. the chisq.test function performs the chi-squared test for independence. chisq.inv.rt is used to find the critical chi value for a certain probability and degrees of freedom. chisq.dist.rt is used to determine the probability for a chi value and certain degrees of freedom.

syntax: chisq.test(actual_range, expected_range)

the chisq.test function returns the test for independence. chisq.test returns the value from the chi-squared distribution for the statistic and the appropriate degrees of freedom. you can use chi-squared tests to determine whether hypothesized results are verified by an experiment. the chisq.test function takes the following arguments:

• actual_range: this is the range of data that contains observations to test against expected values.
• expected_range: this is the range of data that contains the ratio of the product of row totals and column totals to the grand total.

if actual_range and expected_range have different numbers of data points, chisq.test returns an #n/a error.

the chi-squared test first calculates a chi-squared statistic by summing, over every cell, the squared difference between the actual value and the expected value divided by the expected value. chisq.test then returns the probability for that chi-squared statistic and degrees of freedom, df, where df = (r - 1)(c - 1) for a table with r rows and c columns.

syntax: chisq.dist.rt(x, degrees_freedom)

the chisq.dist.rt function returns the one-tailed probability of the chi-squared distribution. the chi-squared distribution is associated with a chi-squared test. you use the chi-squared test to compare observed and expected values.
for example, a genetic experiment might hypothesize that the next generation of plants will exhibit a certain set of colors. by comparing the observed results with the expected ones, you can decide whether your original hypothesis is valid. the chisq.dist.rt function takes the following arguments:

• x: this is the value at which you want to evaluate the distribution.
• degrees_freedom: this is the number of degrees of freedom.

if either argument is nonnumeric, chisq.dist.rt returns a #value! error. if x is negative, chisq.dist.rt returns a #num! error. if degrees_freedom is not an integer, it is truncated. if degrees_freedom is less than 1 or greater than or equal to 10^10, chisq.dist.rt returns a #num! error.

note: the excel 2010 t.test function is equivalent to ttest in legacy versions of excel.

syntax: chisq.inv.rt(probability, degrees_freedom)

the chisq.inv.rt function returns the inverse of the one-tailed probability of the chi-squared distribution. if probability equals chisq.dist.rt(x,...), then chisq.inv.rt(probability,...) equals x. you use the chisq.inv.rt function to compare observed results with expected ones to decide whether your original hypothesis is valid. the chisq.inv.rt function takes the following arguments:

• probability: this is a probability associated with the chi-squared distribution.
• degrees_freedom: this is the number of degrees of freedom.

if either argument is nonnumeric, chisq.inv.rt returns a #value! error. if probability is less than 0 or greater than 1, chisq.inv.rt returns a #num! error. if degrees_freedom is not an integer, it is truncated. if degrees_freedom is less than 1 or greater than or equal to 10^10, chisq.inv.rt returns a #num! error.

chisq.inv.rt uses an iterative technique for calculating the function. given a probability value, chisq.inv.rt iterates until the result is accurate to within ±3 × 10^-7.
if chisq.inv.rt does not converge after 100 iterations, the function returns an #n/a error.

figure 14.42 you can calculate chi-squared testing with chisq.test.

the sum of squares functions

excel offers four functions with confusingly similar names. the hardest part of using these functions is figuring out which function does what. the first three functions require two identically sized arrays, named x and y. these are excel’s four sum of squares functions:

• sumx2my2: for each pair of x and y, excel calculates x^2 - y^2 and then sums these values. in this case, the m in the function name indicates minus.
• sumx2py2: for each pair of x and y, excel calculates x^2 + y^2 and then sums these values. in this case, the p in the function name indicates plus.
• sumxmy2: for each pair of x and y, excel calculates (x - y)^2 and then sums these values. again, the m indicates minus, and the lack of a 2 after the x indicates that it is the difference that is squared.
• sumsq: returns the sum of the squares of the arguments.

in figure 14.43, the x array is in a2:a5, and the y array is in b2:b5. the formulas in d2:d5 calculate x - y for each pair. the formulas in e2:e5 square that difference for each pair. the formula in cell e6 totals the sum of the squares. you could replace the five formulas in column e with a single formula in cell d6: =sumsq(d2:d5). alternatively, you could replace all the formulas in columns d and e with a single use of sumxmy2 in cell b9.

sum of the sum of the squares

some statistical processes ask you to calculate the sum of the sum of the squares. however, a casual survey of several mathematicians could not find one concrete example of when this would be useful. in fact, the formula sumx2py2(a2:a5,b2:b5) is mathematically equivalent to sumsq(a2:b5). further, the formula sumx2my2(a2:a5,b2:b5) is the same as sumsq(a2:a5)-sumsq(b2:b5).
my theory on this is that some early spreadsheets included these functions in an effort to claim that they had more functions than a competitor did. all later spreadsheets have included the functions just because some other competitor included them.

syntax: sumsq(number1, number2, ...)

the sumsq function returns the sum of the squares of the arguments. the arguments number1,number2,... are 1 to 255 arguments for which you want the sum of the squares. you can also use a single array or a reference to an array instead of arguments separated by commas.

syntax: sumxmy2(array_x, array_y)

the sumxmy2 function returns the sum of squares of differences of corresponding values in two arrays. it takes the following arguments:

• array_x: this is the first array or range of values.
• array_y: this is the second array or range of values.

the arguments should be either numbers or names, arrays, or references that contain numbers. if an array or reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if array_x and array_y have a different number of values, sumxmy2 returns an #n/a error.

syntax: sumx2my2(array_x, array_y)

the sumx2my2 function returns the sum of the difference of squares of corresponding values in two arrays. this function takes the following arguments:

• array_x: this is the first array or range of values.
• array_y: this is the second array or range of values.

the arguments should be either numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if array_x and array_y have a different number of values, sumx2my2 returns an #n/a error.
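the relationships among the four sum of squares functions can be checked with a short python sketch. the function names mirror the excel names only for readability, the arrays are made-up sample values rather than the data in the figures, and the sketch does not model excel's handling of text, logical values, or empty cells.

```python
def sumsq(values):
    """Sum of the squares of the values."""
    return sum(v * v for v in values)


def _check_same_length(xs, ys):
    if len(xs) != len(ys):
        # excel returns an #n/a error in this case
        raise ValueError("arrays must have the same number of values")


def sumxmy2(xs, ys):
    """Sum of (x - y) squared for each pair."""
    _check_same_length(xs, ys)
    return sum((x - y) ** 2 for x, y in zip(xs, ys))


def sumx2my2(xs, ys):
    """Sum of x squared minus y squared for each pair."""
    _check_same_length(xs, ys)
    return sum(x * x - y * y for x, y in zip(xs, ys))


def sumx2py2(xs, ys):
    """Sum of x squared plus y squared for each pair."""
    _check_same_length(xs, ys)
    return sum(x * x + y * y for x, y in zip(xs, ys))


# Made-up sample arrays for illustration only.
xs = [3, 5, 7, 9]
ys = [1, 4, 2, 8]

print(sumxmy2(xs, ys), sumx2my2(xs, ys), sumx2py2(xs, ys))
```

with these sample values, sumx2py2 matches sumsq applied to both arrays together, and sumx2my2 matches the difference of the two sumsq results.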
syntax: sumx2py2( array_x , array_y )

the sumx2py2 function returns the sum of the sum of squares of corresponding values in two arrays. the sum of the sum of squares is a common term in many statistical calculations. this function takes the following arguments:

• array_x: this is the first array or range of values.
• array_y: this is the second array or range of values.

the arguments should be either numbers or names, arrays, or references that contain numbers. if an array or reference argument contains text, logical values, or empty cells, those values are ignored; however, cells that contain the value 0 are included. if array_x and array_y have a different number of values, sumx2py2 returns an #n/a error.

figure 14.43 without doing any regression, you use sumxmy2 to calculate the sum of the squares of the difference of two arrays.

note: i will go out on a limb and propose that none of the 500 million people using excel actually use sumx2py2.

in figure 14.44, the one sumx2py2 formula in cell b9 is much simpler than the five formulas in column d, but if you really wanted to do this calculation, you could use sumsq, as shown in cell b12.

figure 14.44 microsoft won’t say that these two functions are useful in statistics. i don’t think they are useful anywhere.

figure 14.45 the lognorm.dist and lognorm.inv functions can make sense of a population where the natural logarithm of the population is normally distributed.

testing probability on logarithmic distributions

in the life sciences, a number of populations have logarithmic distributions. in the population shown in figure 14.45, the values in the sample range from under 2 to over 38,000. the data clearly does not follow a normal distribution. however, if you took the natural logarithm of each data point, the ln(x) of the members does follow a normal distribution.
the mean of the natural logarithms is 5, with a standard deviation of 2. populations where the natural logarithm is normally distributed are called lognormal distributions. an example of a population with a lognormal distribution is the length of time that bacteria live in a disinfectant.

in the example where the mean of the natural logarithm values is 5 and the standard deviation is 2, take a look at what this really means: you use exp(5) to see that the mean of 5 translates to 148. you would expect about 68% of the population to be within 1 standard deviation of the mean. this range from exp(3) to exp(7) is from 20 to 1,096. the range for two standard deviations from the mean is exp(1) to exp(9), or 2.7 to 8,103.

given a lognormal distribution where the mean of the natural logarithm of the population is 5 and the standard deviation is 2, you can predict what percentage of the population will be at a number x or below by using lognorm.dist. to find the value of x associated with a certain probability, you use lognorm.inv.

syntax: lognorm.dist( x,mean,standard_dev,cumulative )

the lognorm.dist function returns the lognormal distribution of x, where the natural logarithm is normally distributed with the parameters mean and standard_dev. you use this function to analyze data that has been logarithmically transformed. this function takes the following arguments:

• x: this is the value at which to evaluate the function.
• mean: this is the mean of the natural logarithm.
• standard_dev: this is the standard deviation of the natural logarithm.
• cumulative: use 1 or true to calculate the cdf. use 0 or false to calculate the pdf.

if any argument is nonnumeric, lognorm.dist returns a #value! error. if x is less than or equal to 0 or if standard_dev is less than or equal to 0, lognorm.dist returns a #num! error.
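to see where these numbers come from, the cumulative lognormal distribution can be written as the normal cdf applied to ln(x). the python sketch below is an illustration under that textbook definition, not excel's own code; the function name lognorm_dist and the use of ValueError in place of the #num! error are assumptions made for this example.

```python
import math


def lognorm_dist(x, mean, standard_dev, cumulative=True):
    """Lognormal CDF or PDF, where ln(x) is normal with the given mean and standard deviation."""
    if x <= 0 or standard_dev <= 0:
        # Stand-in for Excel's #NUM! error.
        raise ValueError("x and standard_dev must be greater than 0")
    z = (math.log(x) - mean) / standard_dev
    if cumulative:
        # Standard normal CDF expressed through the error function.
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # Density of x itself, hence the extra 1/x factor.
    return math.exp(-0.5 * z * z) / (x * standard_dev * math.sqrt(2.0 * math.pi))


# Share of the population within one standard deviation of the mean of ln(x),
# that is, between exp(3) and exp(7) when the mean is 5 and the standard deviation is 2.
within_one_sd = lognorm_dist(math.exp(7), 5, 2) - lognorm_dist(math.exp(3), 5, 2)

# Half of the population lies at or below exp(5), roughly 148.
at_exp_mean = lognorm_dist(math.exp(5), 5, 2)

print(round(within_one_sd, 4), round(at_exp_mean, 4))
```

the one-standard-deviation share works out to roughly 68%, the usual figure for a normal distribution, because the calculation is really being done on ln(x).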
syntax: lognorm.inv( probability,mean,standard_dev )

the lognorm.inv function returns the inverse of the lognormal cumulative distribution function of x, where the natural logarithm is normally distributed with the parameters mean and standard_dev. if probability is equal to lognorm.dist(x,...), lognorm.inv(probability,...) is equal to x. you use the lognormal distribution to analyze logarithmically transformed data. the lognorm.inv function takes the following arguments:

• probability: this is a probability associated with the lognormal distribution.
• mean: this is the mean of the natural logarithm.
• standard_dev: this is the standard deviation of the natural logarithm.

note: lognorm.inv in excel 2010 is equivalent to loginv in excel 2007.

if any argument is nonnumeric, lognorm.inv returns a #value! error. if probability is less than 0 or if probability is greater than 1, lognorm.inv returns a #num! error. if standard_dev is less than or equal to 0, lognorm.inv returns a #num! error.

in figure 14.45, the population varies from 1.7 to 38,577, but the lognorm.dist function predicts that 72.8% of the population is under 500, 87.6% is under 1,500, and 96.1% is under 5,000. in cell t61, the lognorm.inv function reveals that 95% of the population should be under 3,983.

using gamma.dist and gamma.inv to analyze queuing times

earlier in this chapter, we discussed how to use a poisson distribution to analyze how many customers might walk into a bank during any given hour. however, if the time between customers is relevant, you need to use the gamma distribution. the gamma distribution is described by two variables: alpha and beta. for a gamma distribution described by alpha and beta, you can find the probability that a value of x or less will occur with gamma.dist. to find the value of x for a certain probability, you use gamma.inv. the other remaining gamma-related function is gammaln.
syntax: gamma.dist( x,alpha,beta,cumulative )

the gamma.dist function returns the gamma distribution. you can use this function to study variables that may have a skewed distribution. the gamma distribution is commonly used in queuing analysis. this function takes the following arguments:

• x: this is the value at which you want to evaluate the distribution.
• alpha: this is a parameter to the distribution.
• beta: this is a parameter to the distribution.
• cumulative: this is a logical value that determines the form of the function. if cumulative is true, gamma.dist returns the cumulative distribution function; if cumulative is false, gamma.dist returns the probability density function.

if beta is 1, gamma.dist returns the standard gamma distribution. if x, alpha, or beta is nonnumeric, gamma.dist returns a #value! error. if x is less than 0, gamma.dist returns a #num! error. if alpha is less than or equal to 0 or if beta is less than or equal to 0, gamma.dist returns a #num! error. when alpha is a positive integer, the gamma distribution is also known as the erlang distribution.

note: in excel 2007, the lognormdist function always calculated the cdf. when you switch to the lognorm.dist function, you need to add a fourth argument of 1 or true to calculate the equivalent function.

syntax: gamma.inv( probability,alpha,beta )

the gamma.inv function returns the inverse of the gamma cumulative distribution. if probability is equal to gamma.dist(x,...), then gamma.inv(probability,...) is equal to x. you can use this function to study a variable whose distribution may be skewed. this function takes the following arguments:

• probability: this is the probability associated with the gamma distribution.
• alpha: this is a parameter to the distribution.
• beta: this is a parameter to the distribution.

if beta is 1, gamma.inv returns the inverse of the standard gamma distribution. if any argument is nonnumeric, gamma.inv returns a #value! error.
if probability is less than 0 or probability is greater than 1, gamma.inv returns a #num! error. if alpha is less than or equal to 0 or if beta is less than or equal to 0, gamma.inv returns a #num! error.

gamma.inv uses an iterative technique to do its calculation. given a probability value, gamma.inv iterates until the result is accurate to within ±3 × 10^-7. if gamma.inv does not converge after 100 iterations, the function returns an #n/a error.

note: the excel 2010 function gamma.dist is equivalent to the excel 2007 gammadist function.

syntax: gammaln( x )

the gammaln function returns the natural logarithm of the gamma function, Γ(x). the argument x is the value for which you want to calculate gammaln. if x is nonnumeric, gammaln returns a #value! error. if x is less than or equal to 0, gammaln returns a #num! error. the number e raised to the gammaln(i) power, where i is an integer, returns the same result as (i - 1)!.

calculating probability of beta distributions

a beta distribution is used to describe the variability of the percentage of something across samples, such as the percentage of the day people spend sleeping. a beta distribution curve is described by two parameters, alpha and beta. for any given distribution, you can predict the likelihood that a value will be less than or equal to x by using beta.dist. to find the value of x associated with a certain probability, you use beta.inv.

note: beta.dist in excel 2010 is similar to betadist in legacy versions of excel. betadist always calculated the cdf. to switch from betadist to beta.dist, add a 1 as the fourth argument (cumulative) to indicate that the function should be cumulative.

syntax: beta.dist( x,alpha,beta,cumulative,a,b )

the beta.dist function returns the cumulative beta probability density function.
the cumulative beta probability density function is commonly used to study variation in the percentage of something across samples, such as the fraction of the day people spend watching television. this function takes the following arguments:

• x: this is the value between a and b at which to evaluate the function.
• alpha: this is a parameter to the distribution.
• beta: this is a parameter to the distribution.
• cumulative: true or 1 for the cdf, false or 0 for the pdf.
• a: this is an optional lower bound to the interval of x.
• b: this is an optional upper bound to the interval of x.

if any argument is nonnumeric, beta.dist returns a #value! error. if alpha is less than or equal to 0 or beta is less than or equal to 0, beta.dist returns a #num! error. if x is less than a, x is greater than b, or a equals b, beta.dist returns a #num! error. if you omit values for a and b, beta.dist uses the standard cumulative beta distribution, so that a equals 0 and b equals 1.

syntax: beta.inv( probability,alpha,beta,a,b )

the beta.inv function returns the inverse of the cumulative beta probability density function. that is, if probability is equal to betadist(x,...), then beta.inv(probability,...) is equal to x. the cumulative beta distribution can be used in project planning to model probable completion times, given an expected completion time and variability. the beta.inv function takes the following arguments:

• probability: this is a probability associated with the beta distribution.
• alpha: this is a parameter to the distribution.
• beta: this is a parameter to the distribution.
• a: this is an optional lower bound to the interval of x.
• b: this is an optional upper bound to the interval of x.

if any argument is nonnumeric, beta.inv returns a #value! error. if alpha is less than or equal to 0 or if beta is less than or equal to 0, beta.inv returns a #num! error. if probability is less than or equal to 0 or probability is greater than 1, beta.inv returns a #num! error.
if you omit values for a and b, beta.inv uses the standard cumulative beta distribution, so that a equals 0 and b equals 1. beta.inv uses an iterative technique for calculating the function. given a probability value, beta.inv iterates until the result is accurate to within ±3 × 10^-7. if beta.inv does not converge after 100 iterations, the function returns an #n/a error.

using f.test to measure differences in variability

there are three functions for measuring variability among two populations. suppose you need to compare test results from males and test results from females. to determine whether one population has more variability than the other, you use f.test. the f.dist function determines the probability that a value will be less than or equal to x. the f.inv function returns the x value associated with a certain probability.

syntax: f.test( array1 , array2 )

the f.test function returns the result of an f-test. an f-test returns the two-tailed probability that the variances in array1 and array2 are not significantly different. you use this function to determine whether two samples have different variances. for example, given test scores from public and private schools, you can test whether these schools have different levels of diversity. the f.test function takes the following arguments:

• array1: this is the first array or range of data.
• array2: this is the second array or range of data.

the arguments must be numbers or names, arrays, or references that contain numbers. if an array or a reference argument contains text, logical values, or empty cells, those values are ignored. however, cells that contain the value 0 are included. if the number of data points in array1 or array2 is less than 2, or if the variance of array1 or array2 is 0, f.test returns a #div/0! error.

syntax: f.dist.rt( x,degrees_freedom1,degrees_freedom2 )

the f.dist.rt function returns the f probability distribution.
you can use this function to determine whether two data sets have different degrees of diversity. for example, you can examine test scores given to men and women entering high school and determine whether the variability in the females is different from that found in the males. the f.dist.rt function takes the following arguments:

• x: this is the value at which to evaluate the function.
• degrees_freedom1: this is the numerator degrees of freedom.
• degrees_freedom2: this is the denominator degrees of freedom.

if any argument is nonnumeric, f.dist.rt returns a #value! error. if x is negative, f.dist.rt returns a #num! error. if degrees_freedom1 or degrees_freedom2 is not an integer, it is truncated. if degrees_freedom1 is less than 1 or degrees_freedom1 is greater than or equal to 10^10, f.dist.rt returns a #num! error. if degrees_freedom2 is less than 1 or degrees_freedom2 is greater than or equal to 10^10, f.dist.rt returns a #num! error. f.dist.rt is calculated as f.dist.rt = p(f > x), where f is a random variable that has an f distribution.

note: f.test in excel 2010 is equivalent to ftest in legacy versions of excel.

note: f.dist.rt in excel 2010 is equivalent to fdist in legacy versions of excel. in excel 2010, f.dist is a new function to return the pdf or cdf of the f distribution.

syntax: f.inv.rt( probability,degrees_freedom1,degrees_freedom2 )

the f.inv.rt function returns the inverse of the f probability distribution. if probability is equal to f.dist.rt(x,...), then f.inv.rt(probability,...) is equal to x. the f distribution can be used in an f-test that compares the degree of variability in two data sets. for example, you can analyze income distributions in the united states and canada to determine whether the two countries have a similar degree of diversity.
this function takes the following arguments:

• probability: this is a probability associated with the f cumulative distribution.
• degrees_freedom1: this is the numerator degrees of freedom.
• degrees_freedom2: this is the denominator degrees of freedom.

if any argument is nonnumeric, f.inv.rt returns a #value! error. if probability is less than 0 or probability is greater than 1, f.inv.rt returns a #num! error. if degrees_freedom1 or degrees_freedom2 is not an integer, it is truncated. if degrees_freedom1 is less than 1 or degrees_freedom1 is greater than or equal to 10^10, f.inv.rt returns a #num! error. if degrees_freedom2 is less than 1 or degrees_freedom2 is greater than or equal to 10^10, f.inv.rt returns a #num! error.

f.inv.rt can be used to return critical values from the f distribution. for example, the output of an anova calculation often includes data for the f statistic, f probability, and f critical value at the 0.05 significance level. to return the critical value of f, you use the significance level as the probability argument to f.inv.rt.

f.inv.rt uses an iterative technique for calculating the function. given a probability value, f.inv.rt iterates until the result is accurate to within ±3 × 10^-7. if f.inv.rt does not converge after 100 iterations, the function returns an #n/a error.

note: f.inv.rt is the excel 2010 equivalent of finv in legacy versions of excel. the new excel 2010 function f.inv returns the inverse of the left-tailed f distribution.

other distributions: exponential, hypergeometric, and weibull

a few remaining probability distributions are available in excel: exponential, hypergeometric, and weibull.

note: hypgeom.dist in excel 2010 is similar to hypgeomdist in legacy versions of excel. hypgeomdist always returned the probability of exactly sample_s successes, so to convert from hypgeomdist, add false (or 0) as the cumulative argument of hypgeom.dist.

syntax: expon.dist( x,lambda,cumulative )
note: expon.dist in excel 2010 is equivalent to expondist in legacy versions of excel.

the expon.dist function returns the exponential distribution. you use expon.dist to model the time between events, such as how long a bank’s automated teller machine takes to deliver cash. for example, you can use expon.dist to determine the probability that the process takes, at most, one minute. the expon.dist function takes the following arguments:

• x: this is the value of the function.
• lambda: this is the parameter value.
• cumulative: this is a logical value that indicates which form of the exponential function to provide. if cumulative is true, expon.dist returns the cumulative distribution function; if cumulative is false, expon.dist returns the probability density function.

if x or lambda is nonnumeric, expon.dist returns a #value! error. if x is less than 0, expon.dist returns a #num! error. if lambda is less than or equal to 0, expon.dist returns a #num! error.

syntax: hypgeom.dist( sample_s,number_sample,population_s,number_population,cumulative )

the hypgeom.dist function returns the hypergeometric distribution. hypgeom.dist returns the probability of a given number of sample successes, given the sample size, population successes, and population size. you use hypgeom.dist for a problem that has a finite population, where each observation is either a success or a failure, and where each subset of a given size is chosen with equal likelihood. the hypgeom.dist function takes the following arguments:

• sample_s: this is the number of successes in the sample.
• number_sample: this is the size of the sample.
• population_s: this is the number of successes in the population.
• number_population: this is the population size.
• cumulative: 1 or true for the cdf. 0 or false for the pdf.

all arguments are truncated to integers. if any argument is nonnumeric, hypgeom.dist returns a #value! error.
if sample_s is less than 0 or sample_s is greater than the lesser of number_sample or population_s, hypgeom.dist returns a #num! error. if sample_s is less than the larger of 0 or (number_sample - number_population + population_s), hypgeom.dist returns a #num! error. if number_sample is less than 0 or number_sample is greater than number_population, hypgeom.dist returns a #num! error. if population_s is less than 0 or population_s is greater than number_population, hypgeom.dist returns a #num! error. if number_population is less than 0, hypgeom.dist returns a #num! error. hypgeom.dist is used in sampling without replacement from a finite population.

syntax: weibull.dist( x,alpha,beta,cumulative )

the weibull.dist function returns the weibull distribution. you use this distribution in reliability analysis, such as for calculating a device’s mean time to failure. this function takes the following arguments:

• x: this is the value at which to evaluate the function.
• alpha: this is a parameter to the distribution.
• beta: this is a parameter to the distribution.
• cumulative: this determines the form of the function.

if x, alpha, or beta is nonnumeric, weibull.dist returns a #value! error. if x is less than 0, weibull.dist returns a #num! error. if alpha is less than or equal to 0 or if beta is less than or equal to 0, weibull.dist returns a #num! error.

using prob to calculate probability for a population that fits no distribution curve

in some cases, you might have a data set that does not appear to follow any standard probability distribution curve. however, you may have sufficient past data to figure the probability of each outcome. in such a case, you can build a table of the possible outcomes and the probability of each outcome. you use the prob function to figure out the chances that a value x will fall between an upper and a lower limit.
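the table-lookup idea described above amounts to adding up the probabilities of every outcome that falls between the two limits. here is a minimal python sketch of that idea, assuming inclusive limits; the function name prob mirrors the worksheet function, the quiz scores and probabilities are invented for illustration, and the only validation shown is a size check that raises ValueError.

```python
def prob(x_range, prob_range, lower_limit, upper_limit=None):
    """Total probability of the outcomes between lower_limit and upper_limit, inclusive."""
    if len(x_range) != len(prob_range):
        # Simplified stand-in for an Excel error on mismatched ranges.
        raise ValueError("x_range and prob_range must be the same size")
    if upper_limit is None:
        # With no upper limit, return the probability of exactly lower_limit.
        upper_limit = lower_limit
    return sum(p for x, p in zip(x_range, prob_range) if lower_limit <= x <= upper_limit)


# Hypothetical table of quiz scores and the probability of each score.
scores = [0, 1, 2, 3, 4, 5, 6, 7]
probabilities = [0.01, 0.03, 0.06, 0.15, 0.25, 0.25, 0.15, 0.10]

chance_of_5_to_7 = prob(scores, probabilities, 5, 7)
chance_of_exactly_4 = prob(scores, probabilities, 4)
print(round(chance_of_5_to_7, 2), round(chance_of_exactly_4, 2))
```

because the outcomes are discrete, no distribution curve is involved at all; the result is only as good as the probability table you supply.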
syntax: prob( x_range , prob_range , lower_limit , upper_limit )

the prob function returns the probability that values in a range are between two limits. if upper_limit is not supplied, prob returns the probability that values in x_range are equal to lower_limit. this function takes the following arguments:

• x_range: this is the range of numeric values of x with which there are associated probabilities.
• prob_range: this is a set of probabilities associated with values in x_range.
• lower_limit: this is the lower bound on the value for which you want a probability.
• upper_limit: this is the optional upper bound on the value for which you want a probability.

if any value in prob_range is less than or equal to 0, or if any value in prob_range is greater than 1, prob returns a #num! error. if the sum of the values in prob_range is greater than 1, prob returns a #num! error. if upper_limit is omitted, prob returns the probability of being equal to lower_limit. if x_range and prob_range contain a different number of data points, prob returns an #n/a error.

in figure 14.46, the table in a2:b9 shows the probability of achieving a particular score on a seven-point quiz. the range of possible scores in a2:a9 is used as the first argument. the range of probabilities in b2:b9 is used as the second argument. various formulas in column g find the probability of any given test falling between two values.

note: in excel 2010, weibull.dist is equivalent to weibull in legacy versions of excel.

using z.test, confidence.norm, and confidence.t to calculate confidence intervals

confidence testing is one of the most confusing topics in statistics. suppose that you have a very large population, such as the 500 million people who use microsoft excel. you would like to find out how many minutes per month people use pivot tables. it would be difficult to survey the 500 million people. instead, you find a way to survey 30 people.
the mean of those 30 answers is 155 minutes per month. think about the standard deviation of the entire population. there has to be wide variability because more than half the people using excel never use pivot tables, and their answer would be zero. somehow, you miraculously figure out that the standard deviation of the entire population is 220.

you can use the confidence.norm function to ask for the 90% confidence interval about this statistic. the formula confidence.norm(0.10,220,30) returns a confidence interval of 66. this means that for any sample of 30 people using excel, the mean of that sample will be within 66 of the true population mean 90% of the time.

in figure 14.47, a confidence interval is drawn around the sample mean of 11 samples. the 90% confidence level is saying that in 90% of the samples, the confidence interval drawn on the chart will include the true mean of the population. although the data in figure 14.47 is fictitious, the actual mean of that entire population is 78. of the 11 series drawn on the chart, 10 happen to encompass the true mean of 78. note, however, that the interval around the first sample mean of 156 is the one that does not include the true mean.

syntax: confidence.norm( alpha,standard_dev,size )

the confidence.norm function returns the confidence interval for a population mean. the confidence interval is a range on either side of a sample mean. for example, if you order a product through the mail, you can determine, with a particular level of confidence, the earliest and latest the product will arrive.

figure 14.46 this may not fall into any known distribution curve, but the prob function can calculate probabilities, nonetheless.

this function takes the following arguments:

caution: it is tempting to interpret the confidence.norm result to say that 90% of the population is within the error bars. this is wrong.
reread the last paragraph: if you use the sample mean plus or minus the confidence interval, you will include the true mean 9 out of 10 times.

• alpha: this is the significance level used to compute the confidence level. the confidence level equals 100 × (1 - alpha)% or, in other words, an alpha of 0.05 indicates a 95% confidence level.
• standard_dev: this is the population standard deviation for the data range and is assumed to be known.
• size: this is the sample size.

if any argument is nonnumeric, confidence.norm returns a #value! error. if alpha is less than or equal to 0 or alpha is greater than or equal to 1, confidence.norm returns a #num! error. if standard_dev is less than or equal to 0, confidence.norm returns a #num! error. if size is not an integer, it is truncated. if size is less than 1, confidence.norm returns a #num! error.

figure 14.47 the confidence.norm function does not give me a lot of confidence that i can predict the activities of 500 million people using excel based on a survey of 10 people.

using z.test to accept or reject a hypothesis

you use the z.test function for hypothesis testing. suppose that i make a claim that you will be more confident using pivot tables after attending one of my power excel seminars. one month after one of my seminars, i randomly select 30 students from the class and ask them how many minutes during the month they used pivot tables. the sample mean comes back at 156 minutes. this mean is higher than most sample means. but is it high enough to be statistically valid? could i have achieved a sample mean of 156 just randomly?

note: the excel 2010 function z.test is equivalent to the ztest function in legacy versions of excel.

syntax: z.test( array,x,sigma )

the z.test function returns the one-tailed p value of a z-test.
the z-test generates a standard score for x with respect to the data set, array, and returns the one-tailed probability for the normal distribution. you can use this function to assess the likelihood that a particular observation is drawn from a particular population. this function takes the following arguments:

• array: this is the array or range of data against which to test x.
• x: this is the value to test.
• sigma: this is the population (known) standard deviation. if this argument is omitted, the sample standard deviation is used.

if array is empty, z.test returns an #n/a error.

using permut to calculate the number of possible arrangements

suppose your company has 40 products in its catalog. you must choose four items to be featured in an upcoming skymall issue. the sequence in which the products appear in the ad is relevant. you would like to test the possible ads with a test audience. how many different possible ads could you generate? you use the permut function to solve this problem.

syntax: permut( number,number_chosen )

the permut function returns the number of permutations for a given number of objects that can be selected from number objects. a permutation is any set or subset of objects or events in which internal order is significant. permutations are different from combinations, for which the internal order is not significant. you use this function for lottery-style probability calculations. the permut function takes the following arguments:

• number: this is an integer that describes the number of objects.
• number_chosen: this is an integer that describes the number of objects in each permutation.

both arguments are truncated to integers. if number or number_chosen is nonnumeric, permut returns a #value! error. if number is less than or equal to 0 or if number_chosen is less than 0, permut returns a #num! error. if number is less than number_chosen, permut returns a #num! error.
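the count behind permut is the product number × (number - 1) × ... taken number_chosen times. as a minimal python sketch of that arithmetic (the function name permut mirrors the worksheet function, and ValueError is an assumed stand-in for the #num! error), the 40-product, four-slot ad works out as follows:

```python
import math


def permut(number, number_chosen):
    """Number of ordered arrangements of number_chosen items drawn from number items."""
    n = int(number)          # both arguments are truncated to integers
    k = int(number_chosen)
    if n <= 0 or k < 0 or n < k:
        # Stand-in for Excel's #NUM! error.
        raise ValueError("invalid arguments for permut")
    result = 1
    for i in range(k):
        result *= n - i      # 40 * 39 * 38 * 37 for permut(40, 4)
    return result


possible_ads = permut(40, 4)

# If the order of the four products did not matter, this would be a
# combination instead, which is smaller by a factor of 4! = 24.
unordered_choices = math.comb(40, 4)

print(possible_ads, unordered_choices)
```

the first slot can hold any of 40 products, the second any of the remaining 39, and so on, which gives 2,193,360 ordered ads versus 91,390 unordered selections.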
the formula to solve the skymall problem is permut(40,4). the result is that there are 2,193,360 possible permutations of products to appear in a one-page ad in the catalog. that is a lot of possibilities!

caution: a slight problem with the confidence interval function is that the confidence.norm function expects that you know with certainty the standard deviation of the entire population. in real life, if you don’t know the mean of the 500 million people using excel, how would you ever calculate the standard deviation? in reality, when you don’t know the population standard deviation, you often substitute the sample standard deviation, but then you have to use the t distribution instead of the normal distribution that confidence.norm assumes. microsoft added confidence.t to excel 2010 to handle this situation.

using the analysis toolpak to perform statistical analysis

the functions discussed in this chapter are wonderful for doing statistical analysis. if you can use a function to perform some analysis, the function offers a live result. you can change some assumptions, and the results automatically update.

however, many statisticians instead rely on the data tools available in the analysis toolpak. the analysis toolpak can provide beautiful snapshot-type reports that analyze a data set. although these reports provide more information than a typical function, they have the downside that they do not automatically recalculate. if you change one of the assumptions in the data set, you will have to rerun the analysis.

excel offers many options for performing statistical analysis. using functions in excel provides real-time, live results of the data. on the other hand, some of the tools, such as regression, provide additional statistics that run circles around the equivalent functions in excel. in this case, it would be advantageous to use the analysis toolpak.
remember, however, that when you use the data analysis tools from the analysis toolpak, they create static snapshots of the results. if you change the underlying data, you have to rerun the analysis.

note: the data analysis tools in the analysis toolpak vary greatly. some of them are poorly implemented and provide such narrow functionality that it is usually better to use your own functions rather than those tools.

installing the analysis toolpak in excel 2010

in legacy versions of excel, many people would install the analysis toolpak because they needed it to enable the 89 functions it contained. when you enabled the analysis toolpak to access the additional functions, excel silently added a new data analysis item to the tools menu. however, in excel 2010, those 89 functions are already part of the core excel product. thus, it is much less likely that you already have the analysis toolpak installed. to install it, follow these steps:

1. select file, options.
2. from the left list, select add-ins. you see a long list of active and inactive add-ins.
3. from the bottom of the window, select the manage drop-down box and then select excel add-ins. click go. you are taken back to the excel 2003 add-ins dialog.
4. in the add-ins dialog, select the analysis toolpak check box. click ok.

if this process is successful, you get a new analysis group on the data tab. the group has a single button called data analysis, as shown in figure 14.48. note that this item is rather finicky. you must click data analysis to invoke the data analysis dialog box.

figure 14.48 after you successfully install the analysis toolpak, a new group on the data tab offers access to the data analysis dialog box.
generating random numbers based on various distributions

whereas the rand and randbetween functions generate random numbers, the random number generation choice in the data analysis dialog box allows you to create more sophisticated random number populations. here’s how you use it:

1. make sure the analysis toolpak is installed.
2. from the data tab, select data analysis.
3. scroll down and select random number generation and click ok. the random number generation dialog appears (see figure 14.49).

figure 14.49 you can generate random numbers by using the random number generation dialog.

4. in the random number generation dialog, choose the number of columns that you would like to fill with random numbers. if you want three columns of random numbers, enter 3 in the number of variables text box.
5. choose the number of rows that you would like to fill with random numbers. if you want 100 rows of random numbers, fill in 100 in the number of random numbers text box.
6. select one of the seven options in the distribution drop-down. the questions in the parameters frame change for each distribution option:
• for a uniform distribution, you choose upper and lower limits in the parameters frame. this functionality is similar to using the rand worksheet function.
• for a normal distribution, you choose a mean and standard deviation. this functionality is very cool and is not available through the normal excel functions.
• for a bernoulli distribution, you choose a probability of success on each trial. bernoulli random variables have a value of 0 or 1. if you want to model lebron james’s ability to make free throws, you use a bernoulli distribution with a probability of success of 79.4%.
• for a binomial distribution, you specify a p value and the number of trials. for example, you can generate number-of-trials bernoulli random variables, the sum of which is a binomial random variable.
• for a poisson distribution, you specify a value, lambda, that is equal to 1 / mean. poisson distributions are often used to characterize the number of events that occur per unit of time (for example, the average rate at which cars arrive at a toll plaza).
• for a patterned distribution, you specify five parameters. you specify a lower and upper limit in steps of a certain value. you can also specify that each number repeats n times and that the whole sequence repeats y times.
• for a discrete distribution, you specify a range of values and their probabilities. in this case, you might have a list of 40 products in a2:a41 and then their probabilities of being selected in b2:b41. note that the values in the probability column must sum to 100%.
7. in the random seed text box, enter any numeric seed. this concept is a little bizarre. in a computer, random numbers are not really random. scientists call them pseudo-random. if you leave the random seed text box blank, excel uses some strange number (perhaps the number of seconds since 1900 or perhaps the free memory in the stack) as a seed. this ensures that you get different random numbers every time. however, if you enter your own seed, such as 123, and then come back a month later with the same seed, excel generates exactly the same list of random numbers.
8. for the output, you can choose an output range, a new worksheet, or a new workbook. for some unknown reason, this dialog box refers to a new worksheet as new worksheet ply.

generating a histogram

consider a set of 100 observations. if the possible values are from a continuous series, it is likely that you won’t have any two values that are exactly the same. the chart of this data will show a lot of noise, as shown in figure 14.50.

note: there is nothing random about a patterned distribution method.
you are simply creating numbers that follow a certain pattern.

statisticians instead prefer to group those values into similar categories. perhaps logical categories for this data set are 25–34, 35–44, 45–54, and so on. the technical term for these groups is bins.

the histogram tool takes a set of observations and groups them into bins, similarly to the way that the frequency function normally does. however, the histogram tool goes further, offering the cumulative percentage of each bin, and then it re-sorts the bins into a pareto analysis. excel also offers to create a chart based on the output.

to use the histogram tool, follow these steps:

1. make sure the analysis toolpak is installed.
2. think about some groupings for your data and enter these in a new column in the worksheet. the first bin should be less than the minimum value in your data set. if your bin range contains 25, 35, 45, the first bin will include from 25 up through values just less than 35.
3. from the data tab, select data analysis. then select histogram and click ok. the histogram dialog appears.
4. in the histogram dialog, specify the range that contains your observations as the input range. this range does not need to be sorted. you may include a one-cell heading as part of the range. if you do, you must also include a one-cell heading for the bin range and also check the labels option in step 6.
5. specify your range from step 2 as the bin range.
6. if your input and bin ranges contain one-cell headings, select the labels check box.
7. for the output, specify the upper-left corner of a blank spot on the current worksheet, or specify a new worksheet or a new workbook.
8. select the pareto check box. excel produces the histogram and then produces a second histogram. in the second histogram, the most popular bin is sorted to the top of the list.
9. select the cumulative check box.
excel reports the cumulative percentage accounted for by values from the bottom of the list through the current bin.

figure 14.50 plotting the individual points of a sample does not tell you a lot about the sample.

10. select the chart output check box to ask for a chart. note that this default chart is fairly plain looking and needs some customization to be acceptable.
11. click ok to create the histogram.

figure 14.51 shows the histogram dialog box, along with the results of the histogram.

figure 14.51 using input area of column a and the bins in column c, excel produces a histogram in e:j. this is significantly easier than using the frequency array formula.

generating descriptive statistics of a population

excel provides a large number of functions to describe data sets. earlier in this chapter, you learned about functions to calculate the mean, median, mode, skew, and so on of your data. by using the data analysis tools, you can generate all these statistics in a single command. to do so, follow these steps:

1. make sure the analysis toolpak is installed.
2. from the data tab, select data analysis. then select descriptive statistics and click ok. the descriptive statistics dialog appears.
3. in the descriptive statistics dialog, choose the input range for your data set.
4. if the range in step 3 contains a heading in the first row, select the labels check box.
5. set the output as a new range, a new worksheet, or a new workbook.
6. select summary statistics. excel provides values for mean, standard error (of the mean), median, mode, standard deviation, variance, kurtosis, skewness, range, minimum, maximum, sum, count, largest (#), smallest (#), and confidence level.
7. select the confidence level for mean check box and specify the confidence level you want to use.
for example, a confidence level of 95% calculates the confidence level of the mean at a significance of 5%.
8. if you would like row(s) in the output for the kth largest and/or smallest values, select the appropriate check boxes and fill in the value for k. for example, if you ask for the kth largest with a value of 3, excel reports the third-largest value in the data set.
9. click ok. results similar to those shown in figure 14.52 are generated.

figure 14.52 excel can generate every descriptive statistic for a data set with a single command. the output range in c3:d20 is generated from the dialog box shown.

ranking results

the excel rank function has an inherent problem when two results in the data set are tied. whereas the rank.avg function provides a workaround for this problem, the rank and percentile feature cannot overcome this limitation. if you are worried about the possibility of a tie in your data set, you should use the rank.avg function instead of this command.

to assign a rank and percentage to a data set, follow these steps:

1. make sure the analysis toolpak is installed.
2. from the data tab, select data analysis. then scroll down and select rank and percentile and click ok. the rank and percentile dialog appears.
3. in the rank and percentile dialog, choose the input range for your data set. the input range may contain a single-cell heading at the top of the data, but it may not contain any other nonnumeric data. in figure 14.53, it would be nice if excel could accept the names associated with each data point, but it cannot. you have to add them back later.
4. if your input range has a heading in the first row, select the labels in first row check box.
5. choose an output range for the data set.

excel returns the statistics shown in d1:g16 in figure 14.53. notice that the scores have been sorted in high-to-low sequence.
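the tie limitation mentioned at the start of this section can be illustrated outside excel. the following python sketch is not how excel implements either feature. it assumes a hypothetical list of scores and two hypothetical helper names, rank_competition and rank_average, to contrast rank-style tie handling (tied values share the best rank and the next rank is skipped) with rank.avg-style handling (tied values share the average of the positions they occupy).

```python
def rank_competition(values):
    """Rank high-to-low; tied values share the best rank and the next rank is skipped."""
    ordered = sorted(values, reverse=True)
    # index() finds the first position of a value, so ties get the same rank
    return [ordered.index(v) + 1 for v in values]


def rank_average(values):
    """Rank high-to-low; tied values share the average of the positions they occupy."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # best position held by this value
        count = ordered.count(v)       # how many values are tied at this score
        ranks.append(first + (count - 1) / 2)
    return ranks


# hypothetical scores, invented for illustration
scores = [98, 95, 95, 90]
print(rank_competition(scores))  # [1, 2, 2, 4]
print(rank_average(scores))      # [1.0, 2.5, 2.5, 4.0]
```

with competition ranking, no value is ranked third, which is the behavior described for the rank and percentile output.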
in column d, excel refers to each cell as being at point 1, point 2, point 3, and so on. in figure 14.53, column h was added after the fact, using the formula index(a2:a16,d2). cell d2 contains the point number for this row. basically, this function asks for the third value in a2:a16.

notice that carla and jessica are in a tie for second place. no one in this data set is ranked third because of this tie. if you used the rank function as described earlier in this chapter, you could break the ties by using a countif function.

figure 14.53 the rank and percentile tool sorts the data and calculates a rank and a percentile. it cannot resolve ties, however.

using regression to predict future results

the regression tool available in the analysis toolpak runs circles around the linest function in excel. as described previously, linest returns a bizarre unlabeled set of results for a regression. the regression tool, on the other hand, provides a myriad of well-labeled statistics, analysis, and charts as the output.

to perform a regression analysis using the regression tool, you follow these steps:

1. make sure the analysis toolpak is installed.
2. ensure that your data includes one dependent variable, such as sales per day. it can also contain one or more independent variables: items that might explain the variability in sales. (in this example, independent variables include outside temperature, whether it rained, and whether it was a weekend.)
3. from the data tab, select data analysis. then scroll down, select regression, and click ok. the regression dialog appears.
4. in the regression dialog, the input y range must be a single column of data. in this example, it is the range containing sales for each day. be sure to include a cell at the top of the column that describes the data.
5. in the input x range text box, use a range that is the same height as the y range.
the x range can contain one column for each independent variable. in this example, the x range contains columns for temperature, rain, and weekend. for best results, include a cell at the top of each column, with the name of the variable.
6. if your ranges in steps 4 and 5 include headings, select the labels check box.
7. if you want to force the y-intercept to be 0, select the constant is zero check box.
8. the confidence level box is interesting. the program always gives statistics for a 95% confidence level. if you enter a different percentage in this box, you get two confidence levels: one for the default 95%, and one for the other value you enter.
9. specify the output range as the top-left cell of a range. in this example, the regression output occupies from g2 to o119, so make sure that you have a really large area set aside for the results.
10. fill in the remaining options in the regression dialog to add sections to the report:
• residuals: select this to include residuals in the residuals output table.
• standardized residuals: select this to include standardized residuals in the residuals output table.
• residual plots: select this to generate a chart for each independent variable versus the residual.
• line fit plots: select this to generate a chart for predicted values versus the observed values.
• normal probability plots: select this to generate a chart that plots normal probability.

when you are done, the dialog box should look roughly as shown in figure 14.54.

figure 14.54 the hardest part of specifying a regression is remembering that the y range is the value you are trying to predict.

after you run the regression, excel provides the following sections of the report (see figure 14.55):

• regression statistics such as r-squared are provided in the top section.
• an anova analysis is provided.
• the actual regression results are provided in column 2 of the third section. in this example, the prediction for sales comes from h18:h21. the formula would be that sales for any day will be –75 + 2.6 × high temperature, plus 52 if it is a weekend. if it is raining, you subtract 102 from this prediction. remaining columns in this section return the standard error, t statistic, p value, and confidence limits for each variable.
• the next section goes far beyond the linest function. excel uses the regression results to predict sales for each day in the data set. the predicted sales are in column 2 of this section, which is column h in the figure. the comparison of predicted sales to actual sales is shown in the residuals column.
• finally, excel provides a probability table. the table explains that on the worst 12.5% of days, you might sell 44 or less.

figure 14.55 the regression report from the analysis toolpak is fantastic. it provides a more comprehensive view than the linest function.

using a moving average to forecast sales

the moving average command in the data analysis tools is disappointing. the technique of using a moving average to produce future forecasts is based on the concept that variability in the month-to-month actuals is lessened if you always average three months.

after choosing data analysis, moving average, you can specify an input range that contains one column of sales data. the interval value of 3 produces a three-month moving average.

after you use the moving average command, excel adds one column with a series of simple average() formulas. each formula averages the sales from the previous month, this month, and the next month. in theory, you would then use this column as input to the forecasting methods to produce a future forecast.

in figure 14.56, column c is the new moving average column. column d is the standard error column.
this command is really a lot of hassle when you could easily add your own average formula in column c.

figure 14.56 the moving average feature of the data analysis tools is a long route to adding a simple formula.

using exponential smoothing to forecast sales

the exponential smoothing feature in the data analysis tools allows you to set up a forecasting formula that uses exponential smoothing. this method of forecasting requires only two points: the forecast for the previous month and the actual for the current month. the forecast for the next month is created by adding together 75% of the most recent actuals and 25% of the prior forecast. in this example, the 25% is called a damping factor. you can assign any damping factor that you want, but values in the 20% to 30% range are recommended.

to set up an exponential smoothing forecast, follow these steps:

1. make sure the analysis toolpak is installed.
2. ensure that your data includes one column of sales data, such as sales per month.
3. from the data tab, select data analysis. then select exponential smoothing and click ok. the exponential smoothing dialog appears.
4. in the exponential smoothing dialog, the input range should be your single column of sales data. if you include a heading cell, select the labels check box.
5. ensure that the damping factor is between 0.20 and 0.30. with a damping factor of 0.30, the current forecast is based 70% on the most recent actuals and 30% on all the past forecasts.
6. limit the output range to a cell on the current worksheet. ideally, this range starts in the same row as your input range, in an adjacent column.
7. to create a chart comparing forecast and actuals, select the chart output check box.
8. select the standard errors check box. the output contains a second column with a standard error calculation. this calculation analyzes the current period and last three periods.
in row 5, for example, the standard error formula is sqrt(sumxmy2(b3:b5,c2:c4)/3). this formula subtracts the forecast from the actual for the last three months, squares the differences, adds them, divides to find an average, and then takes the square root of the average.
9. click ok to produce the analysis.

figure 14.57 shows the exponential smoothing dialog and the subsequent results of the analysis.

figure 14.57 exponential smoothing provides a forecast that is heavily weighted toward recent actuals.

because the standard error column must analyze four months of forecasts and actuals, the first three data points in the standard error column are always #n/a.

using correlation or covariance to calculate the relationship between many variables

both covariance and correlation are measures of the extent to which two measurement variables vary together. i prefer the correlation coefficient because it is independent of the units involved. say that you are comparing height in inches or centimeters to weight in pounds or kilograms. the correlation coefficient is the same whichever units you choose.

the correlation coefficient returns a value from –1 to 1. correlation coefficient values close to 0 indicate little or no correlation between the measures. a value close to 1 indicates a strong positive correlation: as one variable increases, the other is likely to increase. a value close to –1 indicates a strong negative correlation: the value of one variable is likely to decrease as the value of the other variable increases.

you could calculate these values manually by using the correl or pearson functions in excel, but the data analysis version is particularly well suited to data sets that have many measurements for each member of a population. in this case, the correlation tool generates a correlation coefficient for every possible combination of the measurement statistics.

figure 14.58 shows a database of body statistics for a sample of 125 people.
for each person, the clinician measured 13 key measurements, such as height, weight, and so on. it would be interesting to see if height is a good predictor of weight or if some other measurement is appropriate.

note: a bug prevents excel from entering the label sales forecast in the top row of the output. you have to manually change the headings excel generates.

figure 14.58 in a collection of key measurement stats for 125 members of a population, which measurements are most related?

to build a matrix of correlation coefficients (or covariances), you follow these steps:

1. make sure the analysis toolpak is installed.
2. ensure that your data includes several columns of measurements for a population. each row should represent another member of the population. try to avoid missing values. if one measurement is missing for a population member, that member is thrown out of the entire calculation.
3. from the data tab, select data analysis. then select correlation (or covariance) and click ok. the correlation dialog appears.
4. in the correlation dialog, ensure that the input range includes your row of headings and all the measurements. if you have an id field, do not include it in the input range.
5. if your data has labels as the first row or column of the input range, select the labels check box.
6. select the upper-left corner of the output range. if your input range has n columns, the size of the output range will be (n + 1) × (n + 1).
7. click ok to create the correlation matrix.

figure 14.59 shows the correlation dialog box and the resulting correlation matrix. in this particular example, height and weight have a weak correlation coefficient of 0.21. you can compare this to the correlation coefficient for hip and weight, which has a strong positive correlation of 0.93.
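the idea of a correlation coefficient for every pair of measurements can be sketched in python. this is a minimal illustration under stated assumptions, not the code behind the correlation tool: the helper names pearson and correlation_matrix are hypothetical, and the three short columns of data are invented (weight is deliberately set to exactly twice height, so that pair has a coefficient of 1 regardless of units).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map every pair of column names to the correlation of those two columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# hypothetical measurements, one list per column
data = {
    "height": [60, 62, 65, 70],
    "weight": [120, 124, 130, 140],  # exactly 2 x height
    "age": [40, 30, 20, 10],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

each column correlates perfectly with itself, so the diagonal of the matrix is 1, and the matrix is symmetric because the coefficient does not depend on the order of the two columns.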
the covariance feature works the same as the correlation feature, except the output table is not scaled to provide answers between –1 and 1.

figure 14.59 the correlation coefficient matrix produces results from –1 to 1. values further away from 0 indicate a stronger correlation between the measurement variables.

using sampling to create random samples

earlier in this chapter, in the section on the rand function, you learned about a way to collect a random sample. you can also allow the data analysis tools to produce a random sample for you.

the random sampling feature offers two interesting ways to collect a sample. excel can either randomly select n members of the population, or you can specify that excel should select every kth member of the population.

you follow these steps to select a random sample:

1. make sure the analysis toolpak is installed.
2. ensure that your data is completely numeric. this feature works best on a single column of data, so ensure that you are selecting just a single column. if you have multiple columns of data, excel randomly selects cells from the entire range; for example, the random sample might include cells b2, a5, c7, d10, b2. ensure that you do not include column headings if your data spans multiple columns.
3. from the data tab, select data analysis. then scroll down, select sampling, and click ok. the sampling dialog appears.
4. in the sampling dialog, ensure that the input range includes your data range. if your data includes a single column, and you have headings in the first cell, select the labels check box.
5. for random sampling, ask for a specific number of samples. the other option is to specify periodic sampling, which provides every nth value in the data set.
6. specify the top-left cell of the output range and click ok.

in figure 14.60, excel has produced a random sample of 10 from a rectangular range of data.
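the two sampling modes in step 5 can be sketched in python. this is an illustration only, not the sampling tool’s implementation: random_sample and periodic_sample are hypothetical helper names, the worksheet columns are invented, and the sketch assumes the range is read down each column, moving from the left column to the right.

```python
import random


def random_sample(population, n, seed=None):
    """Draw n values with replacement, so the same value can appear more than once."""
    rng = random.Random(seed)
    return [rng.choice(population) for _ in range(n)]


def periodic_sample(population, k):
    """Return every k-th value: the k-th, the 2k-th, and so on."""
    return population[k - 1::k]


# hypothetical worksheet columns, each read top to bottom
columns = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [10, 20, 30, 40, 50, 60, 70, 80, 90],
    [100, 200, 300, 400, 500, 600, 700, 800, 900],
]
# flatten column by column (assumed traversal order)
flat = [value for column in columns for value in column]

print(random_sample(flat, 10, seed=123))  # may contain duplicates
print(periodic_sample(flat, 4))           # [4, 8, 30, 70, 200, 600]
```

because the random draw is made with replacement, the same cell can be picked twice, which is why a sample produced this way may contain duplicates.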
tip: in step 4, do not include labels if your population spans multiple columns.

figure 14.60 a random sample from the sampling dialog might include duplicates.

the random sampling feature allows for duplicates within the same sample. if you need to make sure that any given sample contains no duplicates, you should use the rand function instead.

if you ask for a periodic sample, excel reads down each column, working through the columns from left to right. selecting every fourth value from g2:k10 in figure 14.60 would select 4, 8 from the first column, and then 30, 70 from the second column. from 70, excel would skip the next three values of 80, 90, and 100, and it would return 200 as the next periodic member of the sample.

using anova to perform analysis of variance testing

anova stands for analysis of variance. the data analysis tools offer three forms of anova testing:

• single-factor anova: this is for measuring variance for two or more samples with a single variable. for example, suppose that you have 18 farm fields. all are planted with the same variety of wheat. six are treated with nutrient a, six are treated with nutrient b, and six are treated with nutrient c. single-factor anova would analyze whether the variances in the populations were random or due to the fertilizers.
• two-factor anova without replication: this is for use when your data can be classified along two different dimensions. for example, suppose that half of the farm fields are downwind from an interstate highway that is heavily traveled by diesel trucks. you could analyze the variance caused by the fertilizer versus the variance caused by the carbon monoxide from the highway.
• two-factor anova with replication: if you have enough samples so that every combination of {fertilizer, highway} has multiple samples, you can perform two-factor anova with replication. otherwise, you use two-factor anova without replication.
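the arithmetic behind a single-factor anova can be sketched in a few lines of python. this is an illustration under stated assumptions, not the analysis toolpak’s code: the yields are invented, one_way_anova_f is a hypothetical helper name, and only the f statistic is computed, because the python standard library does not include the f distribution needed for a p value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic for a single-factor ANOVA over a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (for example, nutrients)
    n = len(all_values)    # total number of observations

    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # variation of each observation around its own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# hypothetical yields for three nutrients, three fields each
nutrient_a = [40, 41, 42]
nutrient_b = [41, 42, 43]
nutrient_c = [42, 43, 44]

print(one_way_anova_f([nutrient_a, nutrient_b, nutrient_c]))  # 3.0
```

a larger f statistic means the group means differ more than the scatter within each group would suggest. the anova output in excel turns that statistic into a p value for you.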
follow these steps to perform a one-way anova test:

1. if your data is set up as records with data for each field, arrange the data in columns for each variable. in figure 14.61, this means taking the data from column b and arranging it in three columns, e, f, and g, with a heading above each column.
2. choose a null hypothesis. for example, your null hypothesis might be that all the nutrients produce a similar mean. if you can reject the null hypothesis, then your hypothesis is that the selection of nutrient has an impact on yield.
3. choose a significance level, alpha, of 0.05. if the statistics from the anova output show a p value less than the alpha, you can reject the null hypothesis and assume that the nutrient has an impact on yield.
4. make sure the analysis toolpak is installed.
5. from the data tab, select data analysis. then select anova: single factor and click ok. the anova: single factor dialog appears.
6. in the anova: single factor dialog, ensure that the input range includes your columns of data.
7. if your input range includes a heading above each column, select the labels in first row check box.
8. in the alpha box, enter the level at which you want to evaluate critical values for the f statistic. the alpha level is a significance level related to the probability of having a type i error (that is, rejecting a true null hypothesis).
9. select the top-left corner for the output range.
10. click ok to produce the result.

in figure 14.61, the important statistic is the p value in cell j23. because this number is smaller than alpha, you can reject the null hypothesis and assume that the nutrients had an impact on the yield.

follow these steps to perform a two-way anova test with replication:

1. arrange your data so that one dimension is spread across the columns. (this can be tricky.)
2. ensure that you have equal numbers of samples along the second dimension.
in figure 14.62, there were three rows of yields from fields downwind from a highway. these rows must be arranged together. for convenience, have a row label in cell g7 to identify this block of data.
3. because you had three rows of sample yields for fields adjacent to highways, you also have to find three rows of sample yields for fields away from highways. this block of three rows must immediately follow the other data.
4. make sure the analysis toolpak is installed.
5. from the data tab, select data analysis. then select anova: two-factor with replication and click ok. the anova: two-factor with replication dialog appears.

figure 14.61 the difference in the sample means is statistically significant.

tip: again, for convenience, in step 3 make sure there is a heading in the first column and first row of this block to identify the value along the second dimension.

6. in the anova: two-factor with replication dialog, ensure that the input range includes sample values as well as an additional row above to identify the first-dimension variables and an additional column to the left to identify the second-dimension variable.
7. in the rows per sample text box, enter the number of rows in each block of data. in this present example, there are three rows of yields for highway fields and three rows for nonhighway fields, so enter 3.
8. in the alpha box, enter the level at which you want to evaluate critical values for the f statistic. the alpha level is a significance level related to the probability of having a type i error (that is, rejecting a true hypothesis).
9. for the output range, select the top-left corner of a large blank area. the anova results will take up 30 rows by 7 columns.
10. click ok to perform the analysis.

evaluating the results

in the results from this analysis, watch for the values in italic in the first column of the output range.
the first block of data in the output range describes the first block of three rows in the input range, with a value of “yes” to the highway question.

the final block of the analysis shows the p values for each dimension and the two dimensions combined. in this particular analysis, it appears that much of the variability is due to highway proximity and does not necessarily have that much to do with the nutrients. the p value of 0.047 for the columns is not enough to reject the null hypothesis that the variability due to nutrients could be random.

figure 14.62 setting up the input range in equal size rows is the key to successful use of two-factor anova analysis.

in some cases, you may have two factors for the anova testing, but you may not have multiple samples for every combination of {dimension1, dimension2}. in this case, you can run two-factor anova testing without replication. the results from this test contain less analysis than do the results from the test with replication. in this test, excel does not predict if factors beyond the two dimensions are causing variability.

to perform a two-factor anova without replication, follow these steps:

1. arrange your data in a crosstab fashion. have values from dimension 1 going across the top row of the data. have values from dimension 2 going down the left column of the data. enter the sample value in each intersection.
2. make sure the analysis toolpak is installed.
3. from the data tab, select data analysis. then select anova: two-factor without replication and click ok. the anova: two-factor without replication dialog appears.
4. in the anova: two-factor without replication dialog, ensure that the input range includes sample values, as well as an additional row above to identify the first-dimension variables and an additional column to the left to identify the second-dimension variables.
5. select the labels check box so excel can get the headings for the dimension 1 and dimension 2 values from the worksheet.
6. in the alpha box, enter the level at which you want to evaluate critical values for the f statistic. the alpha level is a significance level related to the probability of having a type i error (that is, rejecting a true null hypothesis).
7. click ok to run the analysis.

excel analyzes the variance based on the rows and columns, as shown in figure 14.63.

figure 14.63 in this particular sample, the column drives variability more than the rows.

using the f-test to measure variability between methods

if you want to compare two methods, it is helpful to know if the variances in the two methods are roughly the same. the f-test was designed by statistician r. a. fisher. (the f here stands for fisher and nothing intuitive.) the f-test divides one variance by the other, v1 / v2, to produce an f statistic. values close to 1 indicate that the variances are similar.

to run an f-test, follow these steps:

1. set up two ranges with samples from each population. these samples do not have to have the same number of members.
2. make sure the analysis toolpak is installed.
3. from the data tab, select data analysis. then select f-test two-sample for variances. the f-test two sample for variances dialog appears.
4. in the f-test two sample for variances dialog, choose the range for both of your sample ranges.
5. in the alpha box, enter the level at which you want to evaluate critical values for the f statistic. the alpha level is a significance level related to the probability of having a type i error (that is, rejecting a true null hypothesis).
6. select the top-left cell of an output range.
7. click ok to produce the analysis.
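the f statistic itself is a simple ratio, which the following python sketch illustrates. it is not the f-test tool’s implementation: f_statistic is a hypothetical helper name and both samples are invented. the sketch uses the sample variance from the python standard library and stops short of the p value and critical value, which require the f distribution.

```python
from statistics import variance


def f_statistic(sample1, sample2):
    """Return (f, df1, df2): the ratio of the sample variances and each sample's degrees of freedom."""
    f = variance(sample1) / variance(sample2)
    return f, len(sample1) - 1, len(sample2) - 1


# hypothetical samples from two methods; the sizes do not have to match
method_a = [1, 2, 3, 4, 5]
method_b = [2, 4, 6, 8, 10, 12]

f, df1, df2 = f_statistic(method_a, method_b)
print(round(f, 4), df1, df2)
```

here the first sample varies far less than the second, so the ratio is well below 1. a ratio near 1 would suggest similar variances.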
the f-test tool provides the result of a test of the null hypothesis that these two samples come from distributions with equal variances against the alternative that the variances are not equal in the underlying distributions. the f-test tool calculates the value of an f statistic. a value of f close to 1 provides evidence that the underlying population variances are equal.

there is a tricky element to the output table. if the f value is less than 1, you need to look to the next row, which has the label “P(F<=f) one-tail.” it gives the probability of observing a value of the f statistic less than f when population variances are equal. the next row, labeled “f critical one-tail,” gives the critical value less than 1 for the chosen significance level, alpha. if the f statistic is greater than 1, the meanings of these rows are reversed. the row labeled “P(F<=f) one-tail” gives the probability of observing a value of the f statistic greater than f when population variances are equal, and “f critical one-tail” gives the critical value greater than 1 for alpha.

in figure 14.64, the f statistic of 0.88 is less than 1, so the lower-tail reading applies. the null hypothesis is that the variances are equal. the f critical value is 0.35, and because 0.88 is not below that cutoff, you cannot reject the null hypothesis.

performing a z-test to determine whether two samples have equal means

you use the z-test tool to test the null hypothesis that there is no difference between two population means against either one-sided or two-sided alternative hypotheses. z-tests are appropriate when the sample sizes are greater than 30. for sample sizes smaller than 30, you use t-tests, as described in the following section.

to run a z-test, follow these steps:
1. set up two ranges with data from each sample. determine the variance of each population (the square of its standard deviation), because the tool asks for known variances.
2. make sure the analysis toolpak is installed.
3. from the data tab, select data analysis.
then scroll down and select z-test: two-sample for means. the z-test: two-sample for means dialog appears.
4. for variable 1 range, select the range of data for your first sample.
5. for variable 2 range, select the range of data for your second sample.
6. for hypothesized mean difference, if you have a reason to believe that there is a shift from one population to the other caused by an external event, note it here. for example, if you measured the height of every kid in the classroom, and the next day you measured the height of every kid while they were standing on a 6-inch bench, the 6 inches would be an explainable shift in the means.
7. for the variances, enter the known variance of each population (the standard deviation squared). as mentioned previously, if you don’t know these, you should use the z.test worksheet function instead of this tool.
8. in the alpha box, enter the significance level for the test. this value must be in the range 0 to 1. the alpha level is a significance level related to the probability of having a type i error (that is, rejecting a true hypothesis).
9. select the top-left cell of an output range.
10. click ok to produce the analysis.

figure 14.64 the f-test indicates whether two populations have an equal variance.

tip: in step 4, if you choose a heading cell in this range, be sure to also choose a heading cell in step 5.

note: if variances are not known, the worksheet function z.test should be used instead.

the results of a z-test are shown in figure 14.65. when analyzing the results, you should be careful to understand the output:
• “P(Z<=z) one-tail” is really P(Z >= ABS(z)), the probability of a z value further from 0 in the same direction as the observed z value when there is no difference between the population means.
• “P(Z<=z) two-tail” is really P(Z >= ABS(z) or Z <= -ABS(z)), the probability of a z value further from 0 in either direction than the observed z value when there is no difference between the population means. the two-tailed result is just the one-tailed result multiplied by 2.

performing student’s t-testing to test population means

the two-sample t-test tool tests for equality of the population means underlying each sample. there are three varieties of this test, based on assumptions:
• t-test: paired two sample for means. if the two samples came from the same population, one before a treatment and one after the treatment, you use this test.
• t-test: two-sample assuming equal variances. if you believe that the variances of each population are equal, you use this test.
• t-test: two-sample assuming unequal variances. if you believe that the variances of the two populations are unequal, you use this test.

all three varieties produce a t statistic. the t statistic can be negative or nonnegative. under the assumption of equal underlying population means, if t is less than 0, “P(T<=t) one-tail” gives the probability that a value of the t statistic would be observed that is more negative than t. if t is greater than or equal to 0, “P(T<=t) one-tail” gives the probability that a value of the t statistic would be observed that is more positive than t. “t critical one-tail” gives the cutoff value so that the probability of observing a value of the t statistic greater than or equal to “t critical one-tail” is alpha.

“P(T<=t) two-tail” gives the probability that a value of the t statistic would be observed that is larger in absolute value than t. “t critical two-tail” gives the cutoff value so that the probability of an observed t statistic larger in absolute value than “t critical two-tail” is alpha.

to perform a t-test, follow these steps:
1. set up two ranges with data from each sample.
2. make sure the analysis toolpak is installed.
3.
from the data tab, select data analysis. then scroll down and select t-test: two-sample assuming equal variances. the t-test dialog appears.
4. for variable 1 range, select the range of data for your first sample.
5. for variable 2 range, select the range of data for your second sample.
6. for hypothesized mean difference, if you have a reason to believe that there is a shift from one population to the other caused by an external event, note it here.
7. in the alpha box, enter the significance level for the test. this value must be in the range 0 to 1. the alpha level is a significance level related to the probability of having a type i error (that is, rejecting a true hypothesis).
8. select the top-left cell of an output range.
9. click ok to produce the analysis.

the results of a t-test are shown in figure 14.66.

figure 14.65 this z-test indicates that the samples came from different populations.

figure 14.66 based on a t statistic close to 0, you cannot assume that these came from different populations.

using functions versus the analysis toolpak tools

excel offers many options for performing statistical analysis. using functions in excel provides real-time, live results of the data. the data analysis tools in the analysis toolpak vary greatly. some of them are poorly implemented and provide such narrow functionality that it is usually better to use your own functions rather than those tools. on the other hand, some of the tools, such as regression, provide additional statistics that run circles around the equivalent functions in excel. in this case, it would be advantageous to use the analysis toolpak. remember, however, that when you use the data analysis tools from the analysis toolpak, they create static snapshots of the results. if you change the underlying data, you have to rerun the analysis.
15 using trig, matrix, and engineering functions

scientists, mathematicians, and engineers, as well as high school mathematics students, will get the broadest use out of the functions in this chapter. even though many of the trigonometry functions might seem intimidating, this chapter includes practical household examples for many of the functions. anyone who has to lean a ladder against a house can find a use for the trig functions. the imaginary number functions might be useful only to electrical engineers, but any business analyst can make use of the techniques for solving linear equations.

table 15.1 provides an alphabetical list of all of excel 2010’s trig functions. detailed examples of the functions are provided later in the chapter.

table 15.1 alphabetical list of trig functions
function: description
acos(number): returns the arccosine of a number. the arccosine is the angle whose cosine is number. the returned angle is given in radians in the range 0 to π.
acosh(number): returns the inverse hyperbolic cosine of a number. number must be greater than or equal to 1. the inverse hyperbolic cosine is the value whose hyperbolic cosine is number, so acosh(cosh(number)) equals number.
asin(number): returns the arcsine of a number. the arcsine is the angle whose sine is number. the returned angle is given in radians in the range –π/2 to π/2.
asinh(number): returns the inverse hyperbolic sine of a number. the inverse hyperbolic sine is the value whose hyperbolic sine is number, so asinh(sinh(number)) equals number.
atan(number): returns the arctangent of a number.
the arctangent is the angle whose tangent is number. the returned angle is given in radians in the range –π/2 to π/2.
atan2(x_num, y_num): returns the arctangent of the specified x- and y-coordinates. the arctangent is the angle from the x-axis to a line containing the origin (0, 0) and a point with coordinates (x_num, y_num). the angle is given in radians between –π and π, excluding –π.
atanh(number): returns the inverse hyperbolic tangent of a number. number must be between –1 and 1 (excluding –1 and 1). the inverse hyperbolic tangent is the value whose hyperbolic tangent is number, so atanh(tanh(number)) equals number.
cos(number): returns the cosine of the given angle.
cosh(number): returns the hyperbolic cosine of a number.
degrees(angle): converts radians into degrees.
ln(number): returns the natural logarithm of number. natural logarithms are based on the constant e (2.71828182845904).
log(number, base): returns the logarithm of number to the specified base.
log10(number): returns the base-10 logarithm of number.
radians(angle): converts degrees to radians.
sin(number): returns the sine of the given angle.
sinh(number): returns the hyperbolic sine of number.
tan(number): returns the tangent of the given angle.
tanh(number): returns the hyperbolic tangent of number.

table 15.2 provides an alphabetical list of all excel 2010’s matrix functions. detailed examples of the functions are provided later in the chapter.

table 15.2 alphabetical list of matrix functions
function: description
mdeterm(array): returns the matrix determinant of an array.
minverse(array): returns the inverse matrix for the matrix stored in an array.
mmult(array1, array2): returns the matrix product of two arrays. the result is an array with the same number of rows as array1 and the same number of columns as array2.
seriessum(x, n, m, coefficients): returns the sum of a power series based on the formula $\mathrm{series}(x,n,m,a) = a_1 x^{n} + a_2 x^{n+m} + a_3 x^{n+2m} + \dots + a_i x^{n+(i-1)m}$.
sumproduct(array1, array2, array3, ...): multiplies corresponding components in the given arrays and returns the sum of those products.

table 15.3 provides an alphabetical list of all excel 2010’s engineering functions. detailed examples of the functions are provided later in the chapter.

table 15.3 alphabetical list of engineering functions
function: description
besseli(x, n): returns the modified bessel function, which is equivalent to the besselj function evaluated for purely imaginary arguments.
besselj(x, n): returns the bessel function of the first kind.
besselk(x, n): returns the modified bessel function of the second kind, which is equivalent to the bessely function evaluated for purely imaginary arguments.
bessely(x, n): returns the bessel function of the second kind. this is the most commonly used form of the bessel functions. this function provides solutions of the bessel differential equation and is infinite at x = 0. this function is sometimes called the neumann function.
bin2dec(number): converts a binary number to decimal.
bin2hex(number, places): converts a binary number to hexadecimal.
bin2oct(number, places): converts a binary number to octal.
complex(real_num, i_num, suffix): converts real and imaginary coefficients into a complex number in the form x + yi or x + yj.
convert(number, from_unit, to_unit): converts a number from one measurement system to another. for example, convert can translate a table of distances in miles to a table of distances in kilometers.
dec2bin(number, places): converts a decimal number to binary.
dec2hex(number, places): converts a decimal number to hexadecimal.
dec2oct(number, places): converts a decimal number to octal.
delta(number1, number2): tests whether two values are equal.
returns 1 if number1 = number2; returns 0 otherwise. you use this function to filter a set of values. for example, by summing several delta functions, you can calculate the count of equal pairs. this function is also known as the kronecker delta function.
erf(lower_limit, upper_limit): returns the error function integrated between lower_limit and upper_limit.
erfc(x): returns the complementary erf function integrated between x and infinity.
gestep(number, step): returns 1 if number is greater than or equal to step; otherwise returns 0. you use this function to filter a set of values. for example, by summing several gestep functions, you can calculate the count of values that exceed a threshold.
hex2bin(number, places): converts a hexadecimal number to binary.
hex2dec(number): converts a hexadecimal number to decimal.
hex2oct(number, places): converts a hexadecimal number to octal.
imabs(inumber): returns the absolute value (modulus) of a complex number in x + yi or x + yj text format.
imaginary(inumber): returns the imaginary coefficient of a complex number in x + yi or x + yj text format.
imargument(inumber): returns the argument (theta), an angle expressed in radians.
imconjugate(inumber): returns the complex conjugate of a complex number in x + yi or x + yj text format.
imcos(inumber): returns the cosine of a complex number in x + yi or x + yj text format.
imdiv(inumber1, inumber2): returns the quotient of two complex numbers in x + yi or x + yj text format.
imexp(inumber): returns the exponential of a complex number in x + yi or x + yj text format.
imln(inumber): returns the natural logarithm of a complex number in x + yi or x + yj text format.
imlog10(inumber): returns the common logarithm (base-10) of a complex number in x + yi or x + yj text format.
imlog2(inumber): returns the base-2 logarithm of a complex number in x + yi or x + yj text format.
impower(inumber, number): returns a complex number in x + yi or x + yj text format raised to a power.
improduct(inumber1, inumber2, ...): returns the product of 2 to 255 complex numbers in x + yi or x + yj text format.
imreal(inumber): returns the real coefficient of a complex number in x + yi or x + yj text format.
imsin(inumber): returns the sine of a complex number in x + yi or x + yj text format.
imsqrt(inumber): returns the square root of a complex number in x + yi or x + yj text format.
imsub(inumber1, inumber2): returns the difference of two complex numbers in x + yi or x + yj text format.
imsum(inumber1, inumber2, ...): returns the sum of two or more complex numbers in x + yi or x + yj text format.
oct2bin(number, places): converts an octal number to binary.
oct2dec(number): converts an octal number to decimal.
oct2hex(number, places): converts an octal number to hexadecimal.

a brief review of trigonometry basics

there are numerous real-life situations for which trigonometry can be used. in case trigonometry is just a distant nightmare for you, the following sections review some of the basics.

radians versus degrees

nonmathematicians discuss angles in terms of degrees. most corners of a room are at a 90-degree angle. mathematicians discuss angles in a different measurement called radians. although a circle is composed of 360 degrees, it is also composed of about 6.28 radians. each radian is equal to about 57.3 degrees. the exact relationship of degrees to radians requires you to use the mathematical constant pi (π), which is about 3.14159. there are 2π radians in a circle.

because the trig functions were written with mathematicians in mind, they always expect the arguments to be expressed in radians. the formula to convert degrees to radians is to multiply the degrees by pi() and divide by 180.
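the conversion just described can be checked outside excel. the python below is a small sketch, not part of the workbook: the helper name to_radians is hypothetical, while math.radians and math.degrees are python's built-in counterparts and stand in here for the excel functions of the same names.

```python
import math

def to_radians(degrees):
    """Convert degrees to radians by multiplying by pi and dividing by 180."""
    return degrees * math.pi / 180

# The manual formula and the built-in conversion agree.
for angle in (30, 45, 90, 180, 360):
    print(angle, to_radians(angle), math.radians(angle))

# One radian expressed in degrees (about 57.3).
print(math.degrees(1))
```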
to use the pi() / 180 method, you need to write formulas as shown in cell c16 of figure 15.1. fortunately, excel provides the functions radians and degrees to convert from one measurement to another.

syntax: degrees(angle)
the degrees function converts radians into degrees. the argument angle is the angle, in radians, that you want to convert.

syntax: radians(angle)
the radians function converts degrees to radians. the argument angle is an angle, in degrees, that you want to convert.

in figure 15.1, b2:b9 converts degrees to radians. the range b12:b14 converts radians back to degrees. the formulas in rows 16 and 17 contrast using pi() / 180 with the radians function.

figure 15.1 the trig functions in excel expect angles to be in radians. these two functions convert back and forth between radians and degrees.

pythagoras and right triangles

trigonometry relies on triangles. figure 15.2 shows a right triangle, which is a triangle that has one 90-degree angle. in a right triangle, the side opposite the right angle is known as the hypotenuse. in a right triangle, the square of the hypotenuse is equal to the sum of the squares of the two other sides. this is frequently expressed as $c^2 = a^2 + b^2$.

if you know that the two shorter legs of a right triangle measure 3 feet and 4 feet, then you know the following:
$c^2 = 3^2 + 4^2$
$c^2 = 9 + 16$
$c^2 = 25$
$c = \sqrt{25}$
$c = 5$

although this formula was discovered a thousand years before pythagoras was alive, he certainly popularized it, and it is known as the pythagorean theorem.

one side plus one angle trigonometry

there are three classic functions in trigonometry: sine, cosine, and tangent. these functions describe the ratio of two sides of a triangle when you know the angles of the triangle.

consider figure 15.3. one angle is a right angle, which is 90 degrees.
if you can figure out one of the other angles and the length of one leg of the triangle, you can figure out the length of all three sides of the triangle by using excel.

in figure 15.3, one angle is marked θ (theta). the side across from θ is known as the opposite side. the side that is not the hypotenuse and is part of the angle θ is the adjacent side. three classic functions describe the ratio of any two sides:

table 15.4 guide to trig functions
sin(θ) = opposite / hypotenuse
cos(θ) = adjacent / hypotenuse
tan(θ) = opposite / adjacent

figure 15.2 the pythagorean theorem allows you to figure out the length of one side of a right triangle if you know the lengths of the other two sides.

excel offers three trig functions that allow you to find various angles or lengths of a right triangle when you know various combinations of the other angles and/or sides. the examples in this section show some real-world uses of trigonometry.

using tan to find the height of a tall building from the ground

suppose you want to measure the height of a tall building from the ground. the tangent function can find the height of a right triangle if you know the length of the base and the angle to the top of the triangle.

to calculate the height of a building, follow these steps:
1. starting from the building, measure out 35 feet along level ground. sight to the top of the building and determine the angle from that point on the ground to the top of the building, such as 69 degrees. the 35-foot figure is the length of the adjacent side of the triangle. you want to solve for the opposite side of the triangle. the tan function describes the ratio of the opposite side to the adjacent side.
2. in a cell in excel, enter =tan(radians(69)). this tells you that the ratio of the height of the building to the 35 feet is 2.605.
3. because 2.605 = opposite / adjacent, plug in 35 for the adjacent side, to get 2.605 = opposite / 35.
4.
to solve this equation, multiply both sides by 35. the answer, as shown in cell e8 in figure 15.4, is that the building is more than 91 feet tall.

figure 15.3 if you know one angle and the length of one side of a right triangle, you can calculate all the sides of the triangle by using trigonometry.

figure 15.4 you can use the tan function to find the height of this building.

syntax: tan(number)
the tan function returns the tangent of the given angle. the argument number is the angle, in radians, for which you want the tangent. if your argument is in degrees, you convert it to radians by using radians(degrees) or by multiplying it by pi() / 180.

using sin to find the height of a kite in a tree

suppose your children are flying a kite. they have let out all 150 feet of string. the kite is caught at the top of a faraway tree, as shown in figure 15.5. to find the height of this tree, follow these steps:
1. sight the angle from the end of the string to the top of the tree. it measures 29 degrees.
2. refer to table 15.4 earlier in this chapter. because you know the hypotenuse and want to find the opposite side, use the sin function.
3. in a cell in excel, enter =sin(radians(29)). the result is about 0.485.
4. because the sine is the ratio of the opposite side to the hypotenuse, create the formula 0.485 = opposite / 150.
5. to solve for the opposite side, multiply both sides of the equation by 150. you find that the tree is over 72 feet tall.
6. assess your tree-climbing skills. if you do not currently work for davey tree experts, perhaps you should decide to buy the kids a new kite.

syntax: sin(number)
the sin function returns the sine of the given angle. the argument number is the angle, in radians, for which you want the sine. if your argument is in degrees, you multiply it by pi() / 180 to convert it to radians.
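the building and kite calculations above follow the same pattern: convert the angle to radians, take the ratio from table 15.4, and multiply by the side you know. the python below is a sketch of that arithmetic only; the function names are hypothetical, and math.tan, math.sin, and math.radians stand in for the excel functions of the same names.

```python
import math

def opposite_from_adjacent(adjacent, angle_degrees):
    """tan(angle) = opposite / adjacent, so opposite = adjacent * tan(angle)."""
    return adjacent * math.tan(math.radians(angle_degrees))

def opposite_from_hypotenuse(hypotenuse, angle_degrees):
    """sin(angle) = opposite / hypotenuse, so opposite = hypotenuse * sin(angle)."""
    return hypotenuse * math.sin(math.radians(angle_degrees))

# Building: 35 feet from the base, sighting the top at 69 degrees.
building_height = opposite_from_adjacent(35, 69)

# Kite: 150 feet of string at 29 degrees.
tree_height = opposite_from_hypotenuse(150, 29)

print(f"building: {building_height:.1f} feet")
print(f"tree: {tree_height:.1f} feet")
```

in a worksheet, the same two results would come from formulas along the lines of =35*tan(radians(69)) and =150*sin(radians(29)).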
using cos to figure out a ladder’s length

every year, my wife, mary ellen, hires kevin the landscaper to hang a huge holiday wreath on the second story of our house. the holidays come and go, and i find that kevin is wintering in florida. the ladder that i own is not long enough to reach the wreath. much to the amusement of my neighbors, i stand next to the house, with my too-short ladder, and assess the situation.

figure 15.6 shows that i am 10 feet from the house, and the angle to the wreath hanger is 55 degrees. how long of a ladder do i need to borrow from the neighbors?

table 15.4, which was included earlier in this chapter, shows that the cos function determines the relationship between the adjacent side and the hypotenuse. to find the length of the ladder, follow these steps:
1. in excel, enter =cos(radians(55)). the result is 0.574.
2. create the equation adjacent / hypotenuse = 0.574. with an adjacent side of 10 feet, this is 10 / hypotenuse = 0.574.
3. divide both sides of the equation by 10. this tells you that 1 / hypotenuse is 0.0574.
4. take the reciprocal of both sides of the equation. the result, about 17.43, tells you that the hypotenuse is almost 17.5 feet. it looks like i had better visit dick, the neighbor with the 18-foot ladder.

figure 15.5 the sin function can find the height of this tree when you know the length of the string.

syntax: cos(number)
the cos function returns the cosine of the given angle. the argument number is the angle, in radians, for which you want the cosine. if the angle is in degrees, you multiply it by pi() / 180 to convert it to radians.

figure 15.6 the cos function can find the length of the ladder needed to reach the objective.

excel in practice: measuring the distance across a canyon

have you ever seen a pair of surveyors working in your neighborhood? one of the pair is holding a tall pole, and the other person is looking through a sighting device. the surveyor can use trigonometry to measure distances or the angle of decline of a piece of land.
to try your surveying skills, you can measure the distance across a canyon. you start by standing on one side of the canyon with a sighting tool. have your friend stand on the other side of the canyon, holding a 6-foot pole. the angle from the sighting device to the bottom of the 6-foot pole will be ridiculously small, but measurable. you find that the angle comes out to 0.006 degrees.

if you know the height of the opposite side is 6 feet and the angle is 0.006 degrees, you can find the distance across that portion of the canyon by using trigonometry. table 15.4 defines the tangent as the length of opposite / adjacent. now that you have this information, you can follow these steps to find the distance across the canyon:
1. to convert 0.006 degrees to a tangent, use =tan(radians(0.006)). the result, 0.000105, is 6 / adjacent.
2. multiply both sides of the equation by adjacent. divide both sides of the equation by 0.000105. this leaves adjacent = 6 / 0.000105.
3. in cell f16 in figure 15.7, the formula =6/0.000105 indicates that the canyon is 57,142 feet across.
4. in cell f17, divide f16 by 5,280 to find that the canyon is 10.82 miles across at that point. even if you are evel knievel, you probably do not want to attempt to jump across in your rocket-powered motorcycle.

figure 15.7 you can calculate distances across a lake or canyon by using trigonometry.

using the arc functions to find the measure of an angle

if you know the lengths of two sides of a right triangle, you can determine the angles of the triangle by using trigonometry. the arcsine function converts a sine value to an angle, in radians.

suppose that you know the opposite side of a triangle has a length of 3 and the hypotenuse has a length of 5. the sine value is opposite / hypotenuse, or 0.6. you use =asin(0.6) to convert the sine back to the measure of the angle.

note: the result of asin(0.6) is the size of the angle, in radians. to convert from radians to degrees, you use =degrees(asin(0.6)).

excel provides functions to reverse all three of the basic trig functions. you use acos to reverse cos, asin to reverse sin, and atan to reverse tan.

figure 15.8 demonstrates how to use acos, asin, and atan to find the angle size of a right triangle. keep in mind that the three angles in a triangle always add up to 180 degrees. because you know that the right angle is 90 degrees, and figure 15.8 calculates the second angle as 37 degrees, the third angle must be 53 degrees.

figure 15.8 the arc functions find an angle from the ratio of two sides of the triangle.

syntax: acos(number)
the acos function returns the arccosine of a number. the arccosine is the angle whose cosine is number. the returned angle is given in radians, in the range 0 to π. the argument number is the cosine of the angle you want and must be from –1 to 1. if you want to convert the result from radians to degrees, multiply it by 180 / pi() or use the degrees function.

syntax: asin(number)
the asin function returns the arcsine of a number. the arcsine is the angle whose sine is number. the returned angle is given in radians, in the range –π/2 to π/2. the argument number is the sine of the angle you want and must be from –1 to 1. to express the arcsine in degrees, multiply the result by 180 / pi().

syntax: atan(number)
the atan function returns the arctangent of a number. the arctangent is the angle whose tangent is number. the returned angle is given in radians, in the range –π/2 to π/2. the argument number is the tangent of the angle you want. to express the arctangent in degrees, you multiply the result by 180 / pi().

using atan2 to calculate angles in a circle

figure 15.9 shows a unit circle. this is a circle with a radius of 1, plotted on a cartesian grid.
the point on the right side of the circle has a value of x = 1 and y = 0. this is defined as the angle at zero degrees. the point at the top of the circle has a value of y = 1 and x = 0. this is defined as the angle at 90 degrees. given the x- and y-coordinates of any point on the circle, or of any point anywhere, you can calculate the angle by using the atan2 function.

figure 15.9 you use atan2 to find the angle from the x-axis to any point in cartesian coordinates.

syntax: atan2(x_num, y_num)
the atan2 function returns the arctangent of the specified x- and y-coordinates. the arctangent is the angle from the x-axis to a line containing the origin (0, 0) and a point with coordinates (x_num, y_num). the angle is given in radians, between –π and π, excluding –π. a positive result represents a counterclockwise angle from the x-axis; a negative result represents a clockwise angle. this function takes the following arguments:
• x_num: this is the x-coordinate of the point.
• y_num: this is the y-coordinate of the point.

atan2(a,b) equals atan(b/a), except that a can equal 0 in atan2. if both x_num and y_num are 0, atan2 returns a #div/0! error. to express the arctangent in degrees, you multiply the result by 180 / pi() or use the degrees function.

the formulas in column c of figure 15.9 find the atan2 of the points in columns a and b. the result must be converted to degrees by using =degrees(atan2(a2,b2)).

emulating gravity using hyperbolic trigonometry functions

you can apply the trigonometry functions shown so far in this chapter to solve problems in your environment. the hyperbolic trigonometry functions, which we examine next, are far more complex. as shown in figure 15.10, the hyperbolic cosine function, cosh, is effective at graphing the arc of a rope hung between two points.
according to mathworld.com, other uses for hyperbolic trigonometry include the following:
• calculating the gravitational potential of a cylinder
• calculating the rapidity of special relativity
• calculating the profile of a laminar jet
• calculating the schwarzschild metric, using external isotropic kruskal coordinates
• emulating a uniform gravity field by a uniform acceleration, in general relativity

these are complex tasks and i won’t fill you in on the details here. however, if you need to calculate the profile of a laminar jet, head to mathworld.com for details.

excel offers the hyperbolic functions sinh, cosh, and tanh, as well as the reverse functions asinh, acosh, and atanh.

syntax: sinh(number)
the sinh function returns the hyperbolic sine of a number. the argument number is any real number.

syntax: cosh(number)
the cosh function returns the hyperbolic cosine of a number. the argument number is any real number for which you want to find the hyperbolic cosine. in figure 15.10, the cosh function is used in column b to calculate the path of a rope hanging between two points.

syntax: tanh(number)
the tanh function returns the hyperbolic tangent of a number. the argument number is any real number.

figure 15.10 this shape defined by cosh is also known as a catenary.

syntax: asinh(number)
the asinh function returns the inverse hyperbolic sine of a number. the inverse hyperbolic sine is the value whose hyperbolic sine is number, so asinh(sinh(number)) equals number. the argument number is any real number.

syntax: acosh(number)
the acosh function returns the inverse hyperbolic cosine of a number. number must be greater than or equal to 1. the inverse hyperbolic cosine is the value whose hyperbolic cosine is number, so acosh(cosh(number)) equals number. the argument number is any real number equal to or greater than 1.
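the hanging-rope curve and the round-trip identities described above can be sketched in a few lines. the python below is an illustration only: the function catenary_height, its scale parameter a, and the x values are assumptions made for the example and are not taken from figure 15.10; math.cosh, math.acosh, math.sinh, and math.asinh stand in for the excel functions of the same names.

```python
import math

def catenary_height(x, a=1.0):
    """Height of a hanging rope at horizontal position x.

    The scale factor a is an assumption for this sketch; a=1 gives plain cosh(x).
    """
    return a * math.cosh(x / a)

# Sample the curve at evenly spaced points; the lowest point is at x = 0.
xs = [-2, -1, 0, 1, 2]
heights = [catenary_height(x) for x in xs]
for x, h in zip(xs, heights):
    print(x, round(h, 4))

# The inverse functions undo the hyperbolic functions.
print(math.acosh(math.cosh(1.5)))   # approximately 1.5
print(math.asinh(math.sinh(-2.0)))  # approximately -2.0
```

note that acosh(cosh(x)) returns the original value only when x is zero or positive, because cosh gives the same result for x and –x.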
syntax: atanh(number)

the atanh function returns the inverse hyperbolic tangent of a number. the inverse hyperbolic tangent is the value whose hyperbolic tangent is number, so atanh(tanh(number)) equals number. the argument number is any real number strictly between –1 and 1; the values –1 and 1 themselves are excluded.

examples of logarithm functions

if you have read many of my books, you know that i used to have a day job involving forecasting and operations planning. i was constantly battling with the sales force to provide accurate sales forecasts. at the end of each month, we produced a chart to show the forecasted demand and the actual demand. if the forecast and actual were within 15 percent of each other, this was considered a tolerable error, and no discussion was necessary. however, for any points outside the 15 percent tolerance, a team would figure out why we missed the forecast and how to prevent a similar miss in future months.

the initial charts looked horrible. there were 20 products being forecasted, and the monthly demand ranged anywhere from 50 units a month to 10,000 units a month. there were only a few products above the 5,000-unit level, but those few products made it impossible to see any detail for the 17 smaller products, as shown in figure 15.11. rather than produce several different charts, our solution involved giving the y-axis of the chart a logarithmic scale.

common logarithms on a base-10 scale

in a logarithmic scale, the distance from 1 to 10 on the scale is the same as the distance from 10 to 100 and the same as the distance from 100 to 1,000 and the same as the distance from 1,000 to 10,000. each gridline basically appears at 10^1, 10^2, 10^3, 10^4, and so on.

the resulting chart allows you to see detail for the items selling 100 units as well as the items selling 8,000 units.
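to see why the gridlines on a logarithmic axis are evenly spaced, you can take the base-10 logarithm of each gridline value. this is a small python sketch of the idea, not a reproduction of the chart; the variable names are assumptions.

```python
import math

gridlines = [1, 10, 100, 1000, 10000]

# position of each gridline on the axis is its base-10 logarithm
positions = [math.log10(v) for v in gridlines]

# distance between neighboring gridlines
gaps = [b - a for a, b in zip(positions, positions[1:])]

small_item = math.log10(100)    # an item selling 100 units
large_item = math.log10(8000)   # an item selling 8,000 units

print(positions)
print(gaps)
print(small_item, large_item)
```

every gap comes out as one unit of axis height, so an item selling 100 units and an item selling 8,000 units sit only about two gridlines apart instead of being separated by a factor of 80.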
figure 15.12 shows the result of converting the chart in figure 15.11 to a chart with a logarithmic y-axis.

figure 15.11 no one can make out any detail for the smaller values on this chart. the scale of the two or three large products ruins the view of the smaller products.

basically, a logarithm is the power to which a number, the base, must be raised to produce a given value. in the case of the chart in figure 15.12, each plot on the chart is located at a certain power of 10.

in figure 15.13, columns b:e show the original numbers for the table. columns g:j show the base-10 logarithm for each number. 10^1 is 10. 10^2 is 100. the number in cell b3 is 98. this logarithm is going to be between 1 and 2, and probably much closer to 2. the formula in cell g3 reveals that if 10 is raised to about the 1.9912 power, you get 98.

as another example, 10^3 is 1,000, and 10^4 is 10,000. cell b17 contains 5,100. the logarithm for 5,100 is somewhere between 3 and 4. the formula in cell g17, =log10(b17), shows that 10^3.707 results in about 5,100.

excel offers four functions for dealing with logarithms. log10 calculates the logarithms based on raising 10 to a certain power. log can calculate the logarithm for any base. ln and exp deal with a special logarithm.

figure 15.12 you can change the y-axis to show a logarithmic scale, and the detail of the smaller quantities becomes clear.

figure 15.13 the table in g:j is the base-10 logarithm of the numbers in b:e.

syntax: log10(number)

the log10 function returns the base-10 logarithm of a number. the argument number is the positive real number for which you want the base-10 logarithm.

using log to calculate logarithms for any base

excel makes it simple to calculate the logarithm for any base, using the log function. cell b2 of figure 15.14 contains the formula =log(a2,2) to express the number in column a as a base-2 logarithm.
cell e2 contains the formula =log(d2,5) to express the number in column d as a base-5 logarithm.

figure 15.14 the log function can calculate a logarithm with any base.

syntax: log(number,base)

the log function returns the logarithm of a number to the specified base. it takes the following arguments:
• number—this is the positive real number for which you want the logarithm.
• base—this is the base of the logarithm. if base is omitted, it is assumed to be 10.

using ln and exp to calculate natural logarithms

only two logarithms are used frequently in science. the first is the base-10 logarithm that was discussed previously. the second is a natural logarithm where numbers are expressed as a power of the number e.

e is a special number. you can calculate e by adding up all the numbers in the series 1 + 1/(1!) + 1/(2!) + 1/(3!) + 1/(4!) + 1/(5!) + 1/(6!) + 1/(7!) + 1/(8!) + 1/(9!) + 1/(10!) + .... luckily, 10! is about 3.6 million, so 1/(10!) is a very small number: 0.000000275573. after about 1/(17!), the numbers are small enough that they are beyond excel’s 15-digit precision.

this infinite series converges toward a number around 2.718281. this number is a transcendental number, which is abbreviated as e. logarithms for base e are known as natural logarithms. you can calculate e in excel by using a range such as the one shown in a4:c22 in figure 15.15, or you can use exp(1), as shown in cell c24.

natural logarithms are very popular in science because anything with a constant rate of growth follows a curve described by natural logarithms. radioactive isotopes, for example, decay along a curve described by natural logarithms.

whereas common logarithms with base 10 are called logs, natural logarithms with base e are written as ln, which is often pronounced lon. you calculate natural logarithms by using the ln function.

syntax: ln(number)

the ln function returns the natural logarithm of a number.
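the series for e described above can be added up in a few lines of python to see how quickly it converges. this is a sketch using python's math module, not the a4:c22 worksheet range; the function name e_from_series is an assumption.

```python
import math

def e_from_series(last_n):
    # add up 1/0! + 1/1! + 1/2! + ... + 1/last_n!  (0! is 1, giving the leading 1)
    return sum(1 / math.factorial(n) for n in range(last_n + 1))

# watch the running total settle as more terms are included
for last_n in (5, 10, 17):
    print(last_n, e_from_series(last_n))

tenth_term = 1 / math.factorial(10)   # the 1/(10!) term on its own
print(tenth_term)
print(math.exp(1))                    # the direct route, like exp(1) in excel
```

by the time the series reaches 1/(17!), the running total agrees with exp(1) to roughly the limit of double-precision arithmetic, which is why stopping there loses nothing visible.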
natural logarithms are based on the constant e, which is approximately 2.71828182845904. the argument number is the positive real number for which you want a natural logarithm.

little twelvetoes

here is a simple test to see if you attended the same saturday morning school that i did. fill in this phrase: “conjunction junction, ______ ______ ______?” if you instinctively sang, “what’s your function?” then you are a fellow alumnus of the school of tom yohe and david mccall. from 1973 until 1985, abc snuck in educational cartoons in the middle of its other saturday morning fare. known collectively as schoolhouse rock, these segments taught children multiplication tables, grammar, science facts, and american history. perhaps the most ambitious segment was the “multiplication rock” segment, about an alien planet where everyone had 12 toes. in this system, there are new digits after 9: “dek, el, do. in addition, his 12—do—is written 1-0. get it?” this little 60-second cartoon and jingle introduced a generation of children to the concept of a base-12 numbering system in a way that made perfect sense. column h of figure 15.14 uses log(x,12) to express logarithms in a base-12 system.

with common logarithms, you can convert the logarithm back to the original number by using 10^x. however, it is fairly difficult to write 2.71828182845904^x. therefore, excel provides the function exp to raise e to any power.

syntax: exp(number)

the exp function returns e raised to the power of number. the constant e is approximately 2.71828182845904, the base of the natural logarithm. the argument number is the exponent applied to the base e. to calculate powers of other bases, you use the exponentiation operator (^). exp is the inverse of ln, the natural logarithm of number.

to convert the logarithms in column b in figure 15.16 back to the original numbers, use =exp(b2), as shown in column c.

figure 15.15 the calculation of e is fairly complex, as shown in a4:c22.
instead, you can use exp(1).

multiplying and dividing by adding and subtracting

think about the problem 3^4 × 3^7. in this problem, both of the base numbers are the same, so you can multiply by adding the exponents. the result is 3^(7+4), or 3^11. similarly, if you want to divide 7^21 by 7^5, you can find the solution by subtracting: 7^(21 – 5), or 7^16.

the decay of radioactive isotopes follows a natural logarithmic curve. the basic formula is as follows:

number of atoms after time t = original number of atoms × e^(t × constant)

for radium 226, the constant is –0.000436. the table in figure 15.17 shows how to raise e to a certain power by using a table of years. you can see that about half the original sample will have decayed after 1,500 years!

in figure 15.16, e2:h9 walks through a long-winded way of multiplying using only ln and addition. to multiply 4.215 × 7.643, you take the ln of each number in cells f4 and f5. you can then add these numbers in cell f6. the formula in cell g6 uses exp to find the actual answer of 32.21525. now, i realize that this all seems ridiculous because if you are doing this, you obviously have excel and can just do the multiplication directly, as shown in cell g8. however, this is an interesting property of logarithms.

figure 15.16 to reverse the ln function, you use exp.

working with imaginary numbers

multiply the number 2 by itself: 2^2 is 4. the square root of 4 is 2. multiply the number –2 by itself: (–2)^2 is also 4. excel says sqrt(4) is 2, but clearly it can be –2 as well. so, what is the square root of –4? there is no real number that produces –4 when multiplied by itself. excel says that sqrt(-4) is #num!.

to deal with square roots of negative numbers, mathematicians invented the concept of the imaginary number, i. this number is the square root of –1.
at first, no one was sure whether this was relevant, so these numbers were given the name imaginary numbers. since their invention, imaginary numbers have been discovered to have real-world applications. they are used extensively in the physics of electrical circuits. the name imaginary continues to stick.

in the parlance of imaginary numbers, the square root of –4 is 2i. often, the answer to a problem appears as an expression such as a + bi. in this case, a and b are both real numbers. this expression is a complex number. you can plot complex numbers on a coordinate graph, plotting a along the x-axis and b along the y-axis, and then do trigonometry with imaginary numbers.

figure 15.17 for constant growth or decay problems, you can use exp to raise e to a power.

excel offers nine functions that deal with imaginary, or complex, numbers: complex, imreal, imaginary, imsum, improduct, imdiv, imabs, imargument, and imconjugate.

using complex to convert a and b into a complex number

it is hard to deal with complex numbers in excel because they are basically text. think about how you can store 5 + 2i in a cell; it will be difficult to do. you can create a large range of complex numbers in the form a + bi if you have ranges of values for a and b. in figure 15.18, pairs of a and b values are stored in the first two columns of a worksheet. the complex function in column c converts these numbers to complex numbers.

figure 15.18 the complex function builds text results in column c. the eight im functions can do math on these text values.

syntax: complex(real_num,i_num,suffix)

the complex function converts real and imaginary coefficients into a complex number in the form x + yi or x + yj. this function takes the following arguments:
• real_num—this is the real coefficient of the complex number.
• i_num—this is the imaginary coefficient of the complex number.

note: all complex number functions accept lowercase i and j as the suffix.
however, they will not accept an uppercase I or J. using uppercase results in a #value! error. all functions that accept two or more complex numbers require that all suffixes match.

• suffix—this is the suffix for the imaginary component of the complex number. if omitted, the suffix is assumed to be i.

if real_num is nonnumeric, complex returns a #value! error. if i_num is nonnumeric, complex returns a #value! error. if suffix is neither i nor j, complex returns a #value! error.

using imreal and imaginary to break apart complex numbers

complex numbers are in the form a + bi, where i is the imaginary square root of –1. excel stores all complex numbers as text. if you use any of the im functions to generate new complex numbers, you can extract the numbers a and b by using imreal and imaginary.

in figure 15.19, column a contains a range of complex numbers. the formulas in column b extract the real number portion of the complex number. the formulas in column c extract the value that is multiplied by i in the complex number.

figure 15.19 imreal and imaginary break a complex number expression in the form a + bi into the numbers for a and b.

syntax: imreal(inumber)

the imreal function returns the real coefficient of a complex number in x + yi or x + yj text format. the argument inumber is a complex number for which you want the real coefficient. if inumber is not in the form x + yi or x + yj, imreal returns a #num! error.

syntax: imaginary(inumber)

the imaginary function returns the imaginary coefficient of a complex number in x + yi or x + yj text format. the argument inumber is a complex number for which you want the imaginary coefficient. if inumber is not in the form x + yi or x + yj, imaginary returns a #num! error.

using imsum to add complex numbers

figure 15.20 shows two columns of complex numbers. a complex number is in the form a + bi. both a and b are real numbers.
the letter i is the imaginary square root of –1. note that all of the “numbers” stored in columns a and b are stored as text.

figure 15.20 even though all the complex numbers in columns a and b are text, the imsum function adds them with ease.

to add (a + bi) + (c + di), you use the formula (a + c) + (b + d)i. you use imsum to calculate this.

syntax: imsum(inumber1,inumber2,...)

the imsum function returns the sum of two or more complex numbers in x + yi or x + yj text format. the arguments inumber1,inumber2,... are 1 to 255 complex numbers to add. if any argument is not in the form x + yi or x + yj, imsum returns a #num! error.

using imsub, improduct, and imdiv to perform basic math on complex numbers

as with the imsum function, similar rules exist for subtracting, multiplying, and dividing complex numbers. these are numbers stored as text in the form a + bi, where the constant i is an imaginary number representing the square root of –1. these are the rules for the imsub, improduct, and imdiv functions:
• to subtract complex numbers, you use imsub. the formula for (a + bi) – (c + di) is (a – c) + (b – d)i.
• to multiply complex numbers, you use improduct. the formula for (a + bi) × (c + di) is (ac – bd) + (ad + bc)i.
• to divide complex numbers, you use imdiv. the formula for (a + bi) / (c + di) is [(ac + bd) + (bc – ad)i] / (c^2 + d^2).

figure 15.21 shows the results of the basic math functions for complex numbers.

syntax: imsub(inumber1,inumber2)

the imsub function returns the difference between two complex numbers in x + yi or x + yj text format. this function takes the following arguments:
• inumber1—this is the complex number from which to subtract inumber2.
• inumber2—this is the complex number to subtract from inumber1.

if either number is not in the form x + yi or x + yj, imsub returns a #num! error.

syntax: improduct(inumber1,inumber2,...)

the improduct function returns the product of 2 to 255 complex numbers in x + yi or x + yj text format. the arguments inumber1, inumber2,... are 1 to 255 complex numbers to multiply. if inumber1 or inumber2 is not in the form x + yi or x + yj, improduct returns a #num! error.

syntax: imdiv(inumber1,inumber2)

the imdiv function returns the quotient of two complex numbers in x + yi or x + yj text format. this function takes the following arguments:
• inumber1—this is the complex numerator or dividend.
• inumber2—this is the complex denominator or divisor.

if inumber1 or inumber2 is not in the form x + yi or x + yj, imdiv returns a #num! error.

figure 15.21 you can perform basic math with complex numbers.

using imabs to find the distance from the origin to a complex number

a complex number is in the form a + bi, where i is an imaginary number representing the square root of –1. to plot complex numbers on a cartesian grid, you use a for the x-axis and b for the y-axis. the imabs function calculates the distance from the (0, 0) origin in the grid. if you have a complex number in the form a + bi, the formula for an absolute value is sqrt(a^2 + b^2). this results in a real number.

syntax: imabs(inumber)

the imabs function returns the absolute value, or modulus, of a complex number in x + yi or x + yj text format. the argument inumber is a complex number for which you want the absolute value. if inumber is not in the form x + yi or x + yj, imabs returns a #num! error.

figure 15.22 shows imabs functions for several complex numbers. note that the result of imabs(a+bi) is equal to imabs(b+ai).

using imargument to calculate the angle to a complex number

a complex number is in the form a + bi, where i is an imaginary number representing the square root of –1. to plot complex numbers on a cartesian grid, you use a for the x-axis and b for the y-axis.
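the arithmetic rules given earlier for adding, subtracting, multiplying, and dividing complex numbers, along with the distance formula used by imabs, can be checked against python's built-in complex type. this is a sketch for comparison only: python writes the suffix as j and stores complex values as numbers rather than text, and the sample values and the helper close are assumptions.

```python
# sample values (hypothetical): 3 + 4i and 1 - 2i
a, b = 3.0, 4.0
c, d = 1.0, -2.0
z1, z2 = complex(a, b), complex(c, d)

# the rules from the text, written out by hand
sum_rule = complex(a + c, b + d)
diff_rule = complex(a - c, b - d)
product_rule = complex(a * c - b * d, a * d + b * c)
denominator = c ** 2 + d ** 2
quotient_rule = complex((a * c + b * d) / denominator,
                        (b * c - a * d) / denominator)
modulus_rule = (a ** 2 + b ** 2) ** 0.5

def close(p, q):
    # tolerate tiny floating-point differences
    return abs(p - q) < 1e-12

print(close(z1 + z2, sum_rule), close(z1 - z2, diff_rule))
print(close(z1 * z2, product_rule), close(z1 / z2, quotient_rule))
print(close(abs(z1), modulus_rule))
```

each comparison should print true, which confirms that the hand-written rules and the built-in arithmetic agree for this pair of numbers.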
the angle to a complex number assumes that the x-axis is 0 and rotates counterclockwise. to find the angle, in radians, to any complex number plotted on a grid, you use imargument. b14:b23 in figure 15.22 shows the angle for several complex numbers.

syntax: imargument(inumber)

the imargument function returns the angle (theta), in radians, for an imaginary number. inumber is a complex number for which you want to calculate theta. if inumber is not in the form x + yi or x + yj, imargument returns a #num! error.

figure 15.22 taking the absolute value of a complex number results in a real number.

using imconjugate to reverse the sign of an imaginary component

a complex number is in the form a + bi, where i is an imaginary number representing the square root of –1. to plot complex numbers on a cartesian grid, you use a for the x-axis and b for the y-axis. the imconjugate function creates a mirror image of a point, flipped across the x-axis. put another way, the function changes the sign of the imaginary component. for example, 10 + 3i becomes 10 – 3i, and 10 – 3i becomes 10 + 3i.

syntax: imconjugate(inumber)

the imconjugate function returns the complex conjugate of a complex number in x + yi or x + yj text format. the argument inumber is a complex number for which you want the conjugate. if inumber is not in the form x + yi or x + yj, imconjugate returns a #num! error.

figure 15.23 shows the results of several imconjugate formulas.

figure 15.23 you can reverse the sign of the imaginary component of a complex number with imconjugate.

calculating powers, logarithms, and trigonometry functions with complex numbers

the remaining eight im functions calculate powers, exponents, logs, and trig functions from complex numbers:
• imsqrt—calculates the square root of a complex number.
• impower—raises a complex number to a certain power.
• imlog10—calculates the base-10 logarithm or common logarithm of a complex number.
• imlog2—calculates the base-2 logarithm of a complex number.
• imexp—raises the constant e to a complex number. for more information about the exp function, see “using ln and exp to calculate natural logarithms” covered earlier in this chapter.
• imln—calculates the natural log of a complex number.
• imsin—calculates the sine of a complex number.
• imcos—calculates the cosine of a complex number.

figure 15.24 shows the results of these functions for a complex number.

figure 15.24 these functions calculate powers, logs, and trig functions, using text-based complex numbers.

solving simultaneous linear equations with matrix functions

the solver add-in can be used to solve simultaneous equations. however, excel also offers three matrix functions that you can use to solve these equations. although the math involved is beyond the scope of this book, the steps to produce an answer are fairly straightforward.

the following is a problem taken from a math textbook in the han dynasty. the solution can be derived by using matrix functions in excel.

there are three types of grain. three bundles of the first, two of the second, and one of the third make 39 bushels. two of the first, three of the second, and one of the third make 34 bushels. one of the first, two of the second, and three of the third make 26 bushels. how many bushels are in the bundles of each type of grain?

to solve this problem, follow these steps:

1. convert the problem’s words into algebraic equations. assuming that the first type of grain is a, the second is b, and the third is c, you have these three equations:
3a + 2b + 1c = 39
2a + 3b + 1c = 34
1a + 2b + 3c = 26

2. in excel, set up three columns with headings a, b, and c.
in the three rows below these columns, enter the coefficients from each equation. for example, the first row contains 3, 2, and 1. the second row contains 2, 3, and 1. the third row contains 1, 2, and 3. in figure 15.25, the range c5:e7 contains the matrix of coefficients.

3. in another range, enter a matrix of the answers for each equation. this range should be one column wide by three rows tall. the cells should contain 39, 34, and 26. in figure 15.25, this range is in g5:g7.

figure 15.25 amazingly, excel can solve simultaneous equations by using a pair of matrix functions.

4. select a new range that is the same size as the range in step 2. this range will hold an intermediate step with the inverse matrix. in the new range, type the formula =minverse(c5:e7). do not press enter. instead, hold down ctrl+shift while you press enter. this key combination tells excel to calculate an array and enter the results in all the selected cells (see range c10:e12 in figure 15.25).

the inverse of an array is an array that, when multiplied by the original array, produces a new array with ones along the diagonal and zeros everywhere else. in figure 15.25, the range c15:e17 contains the array formula =mmult(c5:e7,c10:e12). the result of the mmult operation is indeed a matrix with a one along the diagonal and zeros everywhere else, as shown in figure 15.25. array formulas are special multi-cell formulas that are entered using ctrl+shift+enter.

5. select a range that is three cells high and one column wide. in this column, enter a mmult function that multiplies the minverse array from step 4 by the answers in step 3. in figure 15.25, the formula in i5:i7 is =mmult(c10:e12,g5:g7). again, you must select all three cells before entering this formula. next, you must press ctrl+shift+enter to enter the formula. the results in cells i5, i6, and i7 stand for the values of a, b, and c, respectively.

6.
to make sure that everything worked, set up test formulas in column k. for example, the test formula in k5 checks to see if 3a + 2b + c equals 39.

this entire process is fairly amazing. all the formulas are live formulas. if you change one of the input variables in any of the ranges, all the matrix functions instantly recalculate to solve the three simultaneous equations.

syntax: minverse(array)

the minverse function returns the inverse matrix for the matrix stored in an array. the argument array is a numeric array with an equal number of rows and columns. the array can be given as a cell range such as a1:c3. it can also be given as an array constant such as {1,2,3;4,5,6;7,8,9}. finally, it can be given as a name for either of these.

formulas that return arrays must be entered as array formulas. to indicate that a formula is an array formula, type the formula, then hold down ctrl and shift while pressing enter. inverse matrices, like determinants, are generally used for solving systems of mathematical equations that involve several variables. the product of a matrix and its inverse is the identity matrix, the square array in which the diagonal values equal 1 and all other values equal 0.

as an example of how the inverse of a two-row, two-column matrix is calculated, suppose that the range a1:b2 contains the letters a, b, c, and d, which represent any four numbers. table 15.5 shows the inverse of the matrix a1:b2.

table 15.5 inverse of the matrix shown in a1:b2 (see figure 15.26)

         column a         column b
row 1    d/(a*d-b*c)      b/(b*c-a*d)
row 2    c/(b*c-a*d)      a/(a*d-b*c)

note: if any cells in array are empty or contain text, minverse returns a #value! error. minverse also returns a #value! error if array does not have an equal number of rows and columns.

tip: an array can be entered in curly braces and is called an array constant. each comma in the array constant indicates that excel should move to the next column in the current row. each semicolon indicates that excel should move to the next row.
to picture the actual shape of {1,2,3;4,5,6;7,8,9}, picture the value 1 in a1, 2 in b1, 3 in c1, 4 in a2, and so on down to 9 in c3.

minverse is calculated to an accuracy of approximately 16 digits, which may lead to a small numeric error when the cancellation is not complete. thus, when you use mmult on this array with the original array, you might find 0.00000000000001 instead of 0 in some cells.

some square matrices cannot be inverted and return a #num! error with minverse. the determinant for a noninvertible matrix is 0.

the mmult function multiplies two arrays. the basic logic is that the top-left cell of the resulting array is the sum of multiplying the first row of array 1 by the first column of array 2. figure 15.27 shows the rest of the rules for a 2 × 2 matrix.

syntax: mmult(array1,array2)

the mmult function returns the matrix product of two arrays. the result is an array with the same number of rows as array1 and the same number of columns as array2. the arguments array1 and array2 are the arrays you want to multiply. the number of columns in array1 must be the same as the number of rows in array2, and both arrays must contain only numbers. array1 and array2 can be given as cell ranges, array constants, or references. if any cells are empty or contain text, or if the number of columns in array1 is different from the number of rows in array2, mmult returns a #value! error.

in figure 15.27, a2:b3 contains array a. a6:b7 contains array b. the result of the mmult formula, array m, is in a10:b11. the rules for the calculation of each cell in m are shown in d2:d5. the actual formulas are shown in d10:d13.

figure 15.26 range a6:b7 contains the minverse of the original array.
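the 2 × 2 inverse from table 15.5 and the row-by-column rule for mmult can be tried out in a short python sketch. this is an illustration under stated assumptions, not excel's own implementation: the function names, the sample matrix, and the use of exact fractions (chosen so that no rounding residue appears) are all assumptions.

```python
from fractions import Fraction

def mmult(m1, m2):
    # each result cell is a row of m1 multiplied by a column of m2, then summed
    return [[sum(m1[i][k] * m2[k][j] for k in range(len(m2)))
             for j in range(len(m2[0]))]
            for i in range(len(m1))]

def inverse_2x2(m):
    # the four cells of table 15.5, written with a common denominator a*d - b*c
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("this matrix cannot be inverted")
    return [[d / det, -b / det], [-c / det, a / det]]

original = [[4, 7], [2, 6]]      # hypothetical sample values
inverse = inverse_2x2(original)
identity = mmult(original, inverse)

print(inverse)
print(identity)
```

with ordinary floating-point numbers in place of fractions, the off-diagonal cells of the product can come out as a tiny nonzero value, which is the same effect as the 0.00000000000001 residue mentioned for minverse.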
when you multiply an array and its minverse array, the resulting array in a10:b11 contains 1s along the diagonal.

using mdeterm to determine whether a simultaneous equation has a solution

if your matrix of simultaneous equations is square, excel can calculate a determinant of the array by using mdeterm. the determinant is a single number, which means this function does not need to be entered as an array formula. if the determinant of an array is nonzero, the simultaneous equation has a unique solution. figure 15.28 shows the calculation for the determinant of a 2 × 2 matrix.

syntax: mdeterm(array)

the mdeterm function returns the matrix determinant of an array. the argument array is a numeric array with an equal number of rows and columns. the array can be given as a cell range such as a1:c3; an array constant such as {1,2,3;4,5,6;7,8,9}; or a name for either of these. if any cells in array are empty or contain text, mdeterm returns a #value! error. mdeterm also returns #value! if array does not have an equal number of rows and columns.

the matrix determinant is a number derived from the values in array. for a three-row, three-column array, a1:c3, the determinant is defined as follows:

mdeterm(a1:c3) = a1*(b2*c3-b3*c2) + a2*(b3*c1-b1*c3) + a3*(b1*c2-b2*c1)

figure 15.27 the mmult function performs matrix multiplication.

matrix determinants are generally used for solving systems of mathematical equations that involve several variables. mdeterm is calculated with an accuracy of approximately 16 digits, which may lead to a small numeric error when the calculation is not complete. for example, the determinant of a singular matrix may differ from zero by 1e-16.

figure 15.28 shows a mdeterm calculation for a 2 × 2 array.

figure 15.28 mdeterm returns the determinant of any square array.
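the three-by-three determinant formula above can be applied to the grain problem from earlier in this section. the python sketch below first checks that the determinant is nonzero and then solves the system with cramer's rule. note that cramer's rule is a substitute technique chosen to keep the example short; it is not the minverse and mmult method used in the worksheet, and the function names are assumptions.

```python
from fractions import Fraction

def det3(m):
    # same layout as the mdeterm formula: row 1 is (a1, b1, c1), and so on
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            + a2 * (b3 * c1 - b1 * c3)
            + a3 * (b1 * c2 - b2 * c1))

def solve_cramer(m, rhs):
    # cramer's rule: replace one column at a time with the answers column
    d = det3(m)
    if d == 0:
        raise ValueError("determinant is zero, so there is no unique solution")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[i]] + row[col + 1:]
                    for i, row in enumerate(m)]
        solution.append(Fraction(det3(replaced), d))
    return solution

coefficients = [[3, 2, 1], [2, 3, 1], [1, 2, 3]]
answers = [39, 34, 26]

determinant = det3(coefficients)
a, b, c = solve_cramer(coefficients, answers)
print(determinant, float(a), float(b), float(c))
```

the unknowns are kept as exact fractions so that the check 3a + 2b + c = 39 holds without rounding, which mirrors the test formulas suggested for column k.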
determinants that are nonzero indicate that the simultaneous equations have a solution.

using seriessum to approximate a function with a power series

there are situations in mathematics in which a value can be approximated by summing many terms in a series. if the terms get progressively smaller, such as 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, they eventually become smaller than excel’s 15-digit significance limit. when each term is a coefficient multiplied by a progressively changing power of x, the series is referred to as a power series. an example of a power series is a1*x^1 + a2*x^3 + a3*x^5 + a4*x^7 + a5*x^9.

figure 15.29 shows a long, complex calculation. the coefficients in column d are found by dividing factorials of even numbers into the number 1 and then multiplying every other value by –1. the value of x is 60 degrees, or pi() / 3. the exponents shown in column e increase from 0 to 16 by 2s. in column f, you raise x to the power in column e. in column g, you multiply column d by column f. finally, you add up all the values in column g to arrive at 0.5, which is a really good approximation of the cosine of 60 degrees.

this example is rather trivial because excel offers a cos function. however, other functions can also be approximated with a power series. one seriessum function in b17 replaces all the calculations in columns e, f, and g. the function needs the list of coefficients in column d. in fact, the number of coefficients tells excel how far to extend the series.

syntax: seriessum(x,n,m,coefficients)

the seriessum function returns the sum of a power series. many functions can be approximated by a power series expansion. this function takes the following arguments:
• x—the input value to the power series.
• n—the initial power to which you want to raise x.
• m—the step by which to increase n for each term in the series.
• coefficients—a set of coefficients by which each successive power of x is multiplied. the number of values in coefficients determines the number of terms in the power series. for example, if there are three values in coefficients, there will be three terms in the power series.

if any argument is nonnumeric, seriessum returns a #value! error.

using sqrtpi to find the square root of a number multiplied by pi

the sqrtpi function multiplies a number by π and then takes the square root of the result. in the previous edition of this book, i was at a loss to explain a use for this function. the difficulty in finding uses for sqrtpi and factdouble became a running joke in my power excel seminars. finally, someone from custom metalcraft in springfield, missouri, pointed out that sqrtpi is used when you need to figure out what size of a square tank is equivalent to a certain size round tank.

as a general example, figure 15.30 shows how to use sqrtpi to find that a pizza that is 10.6” square contains the same area as a 12” round pizza.

figure 15.29 the seriessum function can calculate a power series, given a value x, a pattern for the exponents, and a list of coefficients.

syntax: sqrtpi(number)

the sqrtpi function returns the square root of (number × π). the argument number is the number by which π is multiplied. if number is less than 0, sqrtpi returns a #num! error.

sqrtpi(5) calculates 5*pi() as 15.7 and then takes the square root of 15.7, to return 3.96.

using sumproduct to sum based on multiple conditions

the use of sumproduct has dropped dramatically since excel 2007. until then, sumproduct was one of the favorite methods for solving a particular limitation with sumif. however, since microsoft added the sumifs function to excel 2007, there is less need for sumproduct.
In case you need to share your workbooks with people using legacy versions of Excel, you can work through this example to solve the problem of conditionally summing a range based on two conditions.

Suppose you are starting with the data in columns A:C of Figure 15.31. This simple data set has fields for Region, Product, and Sales. The SUMIF function can add all the sales that occurred in the East: =SUMIF(A2:A17,"East",C2:C17). However, there is no way to use SUMIF to find the sum of all records that are in the East and for product A.

Using SUMPRODUCT to solve this problem requires you to think about a couple of virtual arrays. These arrays are entered as intermediate steps in Figure 15.31 so you can picture them:

• In column E, the formula tests whether the cell in column A is equal to East.
• In column F, the formula tests whether the cell in column B is equal to A.
• Column G contains an interesting formula. Cell G2 multiplies the sales in cell C2 by the TRUE/FALSE value in cell E2 and then multiplies that by the TRUE/FALSE value in cell F2. In Excel's treatment of TRUE/FALSE values, a TRUE is calculated as a 1, and a FALSE is calculated as a 0. Thus, in cell G2, 8 × TRUE × TRUE is like multiplying 8 × 1 × 1, which results in 8.
• If either cell in column E or column F is FALSE, Excel treats the value as a zero. Because zero times anything is zero, the result in column G shows up as zero if the corresponding value in either column E or column F is FALSE.
• In cell G18, a SUM function totals the products from column G to answer how many sales of product A were made in the East.

Figure 15.30 SQRTPI is useful for converting round areas to square areas.

The SUMPRODUCT function does all the steps from columns E, F, and G in a single function, as shown in cell B23 in Figure 15.31.

There is a strange problem when using SUMPRODUCT to multiply arrays that contain TRUE or FALSE.
Although Boolean logic says that TRUE × TRUE is TRUE, the SUMPRODUCT function cannot do this operation, because it treats nonnumeric array entries, including TRUE and FALSE, as zeros. Thus, you have to change the array of TRUE/FALSE values to an array of 0s and 1s. There are three generally accepted ways of doing this.

Figure 15.31 The rather long and winding calculations in E2:G18 answer how many units meet two conditions. A single formula in B23 replaces all these steps.

Method 1 replaces the commas indicated in the syntax with asterisks. This forces the TRUE values in the arrays to become 1s and the FALSE values to become 0s.

=SUMPRODUCT((C2:C17)*(A2:A17=A23)*(B2:B17=B22))

Method 2 surrounds the TRUE/FALSE arrays with the N() function.

=SUMPRODUCT(C2:C17,N(A2:A17=A23),N(B2:B17=B22))

Method 3 uses a double negative before each TRUE/FALSE array to change the TRUE/FALSE values to 1/0.

=SUMPRODUCT(C2:C17,--(A2:A17=A23),--(B2:B17=B22))

Syntax: =SUMPRODUCT(array1,array2,array3,...)

The SUMPRODUCT function multiplies corresponding components in the given arrays and returns the sum of those products. The arguments array1, array2, array3,... are 2 to 255 arrays whose components you want to multiply and then add together. The array arguments must have the same dimensions. If they do not, SUMPRODUCT returns a #VALUE! error. SUMPRODUCT treats array entries that are not numeric as if they were zeros.

To solve a problem that has multiple conditions, you need to create three virtual arrays in the function arguments. Here's how you do it:

1. Make the first array the sales in C2:C17.
2. Make the second array a test to see if column A is equal to East. This will be (A2:A17="East").
3. Make the third array a test to see if column B is equal to A. This wil