D R A F T
A spreadsheet can simply contain a series of items, arranged horizontally or vertically,
but more commonly, they are arranged in a list format. This format is useful in its own
right, as a clear way to organize information for later representation in a visualization,
and most spreadsheet applications need the data in the list format for graphing and other
In a list, the vertical columns are used to group like things
together (i.e. attributes), and each horizontal row is dedicated
to enumerating those attributes for a particular entity. The top
row, called the header, is usually dedicated to a list of names
that describes the attributes, called fields. In this example, the
rows contain individual people and the columns contain
attributes about that person (their name, age and sex) under
the appropriate field names.
Freezing of columns/rows
In larger data sets, it is useful to always know what field name you are currently viewing
once the top header line that defines them has scrolled out of view. You can "freeze" the
rows or columns at some point, so they do not scroll with the rest of the cell matrix to
provide a constant reminder of their meanings. In Excel, this is done by the cell whose
column and row you wish to retain the selecting the Freeze Panes item in the Window
menu. In Google Docs, this is accomplished by selecting the Freeze Rows or Freeze
Columns item in the Tools menu.
When you copy a cell to the clipboard, for later inclusion into another cell, Excel has to
make some hard choices: Does it need to copy the formula, or the end result of that
formula. For example, if a cell contained =A1+B1, there may be times when we want to
duplicate that formula for use in calculating different cells, and other times when we want
to copy the number value generated when A1 and B1 were added.
By default, Excel copies the formula, but if we want the value, Excel has a very useful
item called Paste Special in the Edit menu that offers some options to the way cells that
have be copied into the clipboard can be pasted into new cells. This brings up a dialog
box with some options under Paste, including one called Values which will only copy the
results of the formula's calculation and not the formula itself.
Transposing columns and rows
Sometimes the cells are arranged in the right direction for the immediate task. We may
need the cells to run horizontally instead of vertically, or vice-versa. The Paste Special
dialog box has a checkbox that facilitates this.
C# PowerPoint - Search and Find Text in PowerPoint
C# PowerPoint - Search and Find Text in PowerPoint. Learn How to Search Text in PDF Document and Obtain Text Content and Location Information. Overview. pdf find text; how to select text in pdf and copy
D R A F T
Most spreadsheet application offer a number of ways to change the appearance of the
cells, in terms of changing the text font, size, and color, and the cell's internal and
external color. From the data's point of view when used in a visualization, this graphical
formatting is largely ignored as window dressing, but in more complex sets of data, the
ability to graphically group items by color can shed insight into the nature of the data,
and make it easier to navigate through larger datasets.
Cell data formats
The native value of a cell falls in three primary categories: Text which represents letters
and numbers as a stream of characters; Numbers which represents whole and fractional
numbers; and Dates which reflect the idea of time. Within these three basic categories,
spreadsheets do offer some ways to format the data that makes it more understandable,
such as adding comma to separate thousands in large numbers. This formatting does not
change the underlying value of the cell, just how it is displayed.
For example, .045 and 4.5% represent the same number and mathematically, but the latter
can more understandably express value to people without sacrificing accuracy. Likewise,
consciously limiting the display of the number of decimal places (1.73346643 to 1.73)
can help remove the appearance of complexity and increase understanding while again
not sacrificing the accuracy of the base value. Likewise, the Number and Currency
formats which add 1,000's commas and the $ symbols help make clarify the data.
Forcing a number as text
By default, Excel formats cells in a format called General, which is a hybrid of the text
and number formats. If you type a number in a General cell, it will be treated as a number
for calculation purposes, and if you type text, it will treat it as text.
There are some circumstances where you actually want to treat a number as text. If you
were storing an archive record number such as 000623.0, the spreadsheet would assume
you wanted the value of that number and simplify it to 623. You can tell the spreadsheet
to treat the cell as a text value by setting its format to Text, or preceding the number with
an apostrophe (i.e. '000623.0) to force it to be considered as text. One visual cue that
spreadsheets offer as to the formatting is the alignment: Numbers are right-justified and
text is left justified.
D R A F T
Spreadsheets handle dates in a more complicated manner, so some of the results from
formulas and calculations may come out differently than expected. There are numerous
ways to store dates (i.e. 1800, 5/1800, 5/2/1800, May 2, 1800, etc.) but they all reflect the
same moment in time. Most spreadsheets convert that complex representation to a
simpler native format when doing calculations on dates, by storing dates internally as the
number of days (plus or minus) from January 1, 1900.
This allows for the proper sorting of dates, with 11/2/1800 showing up after 4/2/1800
instead of before if it were alphabetically sorted, and arithmetic operations such as
finding the number of days between two dates possible. For example, the formula
=11/2/1800 - 4/2/1800 would result in 214 days (7 months) to be returned to the cell.
There are a large number of built-in functions that make it easy to tease out the days,
weeks, months, and years from dates.
Sorting the data can be an excellent way to get some insight into the underlying
phenomena it is representing. For example, if we wanted to get a quick idea of how many
times the lunch menu had Tacos on Tuesday, we would sort the list by the weekday field,
followed by a secondary sort by the entree field. This would cause all the rows in the list
to first be group by the day, so all the Mondays would be grouped together, all the
Tuesdays, etc. The second sort would then sort the entrees alphabetically within
wise groupings to make a clump of Taco entries stand out from the rest.
Clicking on the Sort item in the Data menu in Excel (in the Tools menu from Google
Docs) will bring up a dialog box with the options to sort the list by. You can sort the data
up or down by column, and add then further sort the information based on the order in
other columns. The sort can be alphabetically ascending (i.e. A-Z) or descending.
Because of the sort is alphabetical, dates and numbers can sometimes be incorrectly
sorted. If your data has a header row, be sure to click on the button so that row does not
get sorted with the rest of the data.
XDoc, XImage SDK for .NET - View, Annotate, Convert, Edit, Scan
NET PDF SDK to Edit, Convert,. View, Write, Comment PDF files. Excel SDK to Edit, Convert, View. Write files. Learn More. easily find out our products capabilities pdf text search; search pdf for text in multiple files
D R A F T
Using the hundreds of built-in functions in modern spreadsheet, not just simple arithmetic
that can be performed between cells, but a wide variety of function ranging from simply
adding a series of numbers together to sophisticated statistical and financial operations.
Each function has a number of parameters within its parenthesis and returns the results of
their calculation back to the cell, or as a parameter to another function.
The number and kind of parameters for each function is dependent upon that particular
function. In Excel, all of the available functions are listed by their type by clicking on the
fx button to the left of the formula bar, grouped by category. Choosing one will bring up
its description and how to use it. Google Docs has many of the same functions available
by clicking on the Function item in the Insert menu.
For example, We could use the SUM function to add up a column
of numbers by using the formula =SUM(B2:B4) and place the
result in the cell. The parameter to the SUM function is the range
of cells to add up. The range of cells to add was specified by the
range B2:B4 to include B2, B3, and B4. The result from the SUM
function could be just as well used in another operation, say to
create an average: =SUM(B2:B4)/3, although there is an
AVERAGE function built-in for this common operation.
Commonly used functions
The following functions are commonly used in visualizations and grouped by the general
category they appear in. There are literally hundreds of functions available in most
spreadsheets, so it is worthwhile to look at the help screens to see if a more appropriate
one is available than the one listed below:
ABS (number) - Returns the absolute value (non-negative) of a number,
AVERAGE (range) - Returns the average (mean) of a range of numbers.
CEILING (number, 1) - Return the rounded-up integral part of a number.
FLOOR (number, 1) - Return the lower integral part of a number.
MAX (num1, num2, ... ) - Return the largest of 2 or more numbers.
MIN (num1, num2, ... ) - Return the smallest of 2 or more numbers.
SUM (range) - Returns the total of a range of numbers.
D R A F T
Date and Time
DATE (year, month, day) - Returns the date based on year, month and date.
DAY (date) - Returns the day number (1-31) from a date.
HOUR (date) - Returns the hour number (0-23) from a date.
MINUTE (date) - Returns the minute number (0-59) from a date.
MONTH (date) - Returns the month number (1-12) from a date.
WEEKDAY (date) - Returns the day text (Sunday-Saturday) from a date.
YEAR (date) - Returns the year number (1900-9999) from a date.
CLEAN (text) - Returns the text cleaned of any non-printable characters.
CONCATONATE (text1,text2, ... ) - Returns the union of 2 or more strings.
LOWER (text) - Returns the text in all lower-case letters.
PROPER (text) - Returns the text in title case (1st letter of words in caps).
TRIM (text) - Returns the with extraneous spaces removed.
UPPER (text) - Returns the text in all upper-case letters.
HLOOKUP (value, range, index) - Returns the matching value for value
found in a horizontal look up table (see section below on look-ups).
VLOOKUP (value, range, index) - Returns the matching value for value
found in a vertical look up table (see section below on look-ups).
FTEST (range1, range2) - Returns probably of 2 means being the same.
AVERAGE (range) - Returns the mean of a range of numbers.
MEDIAN (range) - Returns the median of a range of numbers.
MODE (range) - Returns the mode of a range of numbers.
PEARSON (range1, range2) - Returns the correlation between 2 ranges.
STDEV (range) - Returns the standard deviation of a range of numbers.
TTEST (r1, r2, tails, type) - Returns probably of 2 means being the same.
VAR (range) - Returns the variance of a range of numbers.
D R A F T
The ability to visually represent data is as valuable to the designers of interactive
visualizations as it is to its consumers. Spreadsheets make it easy to quickly create charts
and graphs to visually explore the data relationships. This can lead to insights into the
underlying phenomenon that can direct the kind of visualizations that will be most
Excel and Google Docs have excellent charting capabilities and work in a similar
manner. You select the table you want to chart by highlighting the cells that define it and
click on the Chart icon in the tool bar. A dialog box will offer a number of chart style and
you can instantly see a preview of that style using your data.
Choosing the right kind of graph
The various styles are useful in exploring different kinds of relationships between the
Time-Series relationships, where the values of data
are plotted vertically as time marches across are most
fruitfully rendered by line and area styles.
Quantitative relationships between items in
a data set are best drawn using bar, area and
Part-to-whole relationships, where the relative
value of one item is compared with the group is
well represented by pie, stacked bar/area charts.
Correlation relationships, where a number data
points are plotted using two variables are best
drawn using a scatter, and bubble charts.
Hierarchical/Organizational relationships between
individual members can yield insight using
organization maps, network diagrams, and trees.
A trend line is a line overlaid upon chart that "smoothes-out" the data an give you an
overall sense of the trend the data is taking, and is useful in seeing whether in general, the
values are increasing or decreasing over time. For more information, see the section on
Trends in Time series data on page 37. In Excel, you click on the Add trendline option in
the Chart menu.
D R A F T
While it is certainly possible manually type in data into a spreadsheet, most people
choose to import it from some existing source, which can be a text file, or a structured
table from a database or website page.
Data from text files
The goal of putting data in a spreadsheet is to organize that data in some meaningful way
using the fields have been defined. Unfortunately, many primary sources of data come
from largely unstructured sources, such as letters, documents and other largely prosaic
styles of organization.
It can be useful to use a word-processor, such as Microsoft Word or Google Docs to
prepare the text prior to import into the more rigid constraints of a spreadsheet by
creatively using the Find and Replace functionality to rearrange the data into the columns
and rows required. The text can be easily reformatted so that it will slot into the proper
columns and rows when pasted into the spreadsheet by making sure each row of data is
listed on its own line, with tabs between each field.
To aid in this process, Word has some special character that allow you to search for and
insert characters like new lines (^p) and tabs (^t). The Special button, exposed when the
More options button is active will show all the available characters. For example, the
following find and replace combination would look through the text for any double-line
feeds followed by a Title: as the delimiter to a record and replace it with a simple line
Find what: Title:^p^p Replace with: ^p
If the field values are separated by commas or some other separator, you will need to put
tabs so the spreadsheet recognizes them as distinct fields with a following combination
: Find what: , Replace with: ^t
Finally, Excel has an option in the Data menu called Text to Columns that will parse
some unformatted text in a cell or range of cells, and distribute the contents into columns
based on one or more defined delineators, such as a tab, comma, etc.
Documents you may be interested
Documents you may be interested