
Data: Finding Data

Discovery Tools for Existing Data

DATA SETS

The library provides access to SimplyAnalytics, a tool that lets researchers mine U.S. census data and business data sets to generate maps or build data sets that can be exported to other tools, such as Excel, SPSS, and SAS, for further manipulation. For additional details, see our SimplyAnalytics information and help video page.

Researchers can search for data sets using Databib, a searchable catalog (registry, directory, or bibliography) of research data repositories. These data sets must be imported, in some cases after format manipulation, into data tools such as SPSS and SAS. For additional information about data support, see the Purdue University Data site.

Another new initiative is Google's Dataset Search, which attempts to harvest and federate scientific datasets.

DOE Scientific Research Data - quickly see what data are available and where data collections reside, then go directly to the data. Users can peruse recently added or revised content; view the hundreds of datasets, data streams, and data collections by title; display data from more than 50 subject categories; and select content by sponsoring or originating research organization.

Open Science Framework (OSF) is another platform for storing and sharing research data.

Zenodo is a platform created by CERN as an open, dependable home for the long tail of science, enabling researchers to share and preserve research outputs of any size, in any format, and from any science.

Another way to uncover data sets is to add the term "site:statista.com" to a web search. For example: alcoholic beverages site:statista.com
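The same site-restricted search can be assembled programmatically. The following is a minimal sketch, not a required method: the function names are hypothetical, and the Google search URL with its `q` parameter is an assumption about the search engine you choose to use.

```python
from urllib.parse import quote_plus


def site_query(terms: str, domain: str = "statista.com") -> str:
    """Return a search string that limits results to a single domain."""
    return f"{terms} site:{domain}"


def search_url(query: str, engine: str = "https://www.google.com/search") -> str:
    """Build a search URL; the engine address is an assumption, swap in your own."""
    return f"{engine}?q={quote_plus(query)}"


query = site_query("alcoholic beverages")
print(query)              # alcoholic beverages site:statista.com
print(search_url(query))  # URL-encoded form that can be pasted into a browser
```

Changing the `domain` argument applies the same technique to any other site that publishes statistics.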


DATA FROM WEB PAGES

Outwit is a software product that allows you to automatically harvest (capture and download) embedded links, associated documents and data files, and to scrape (identify and capture) text data from web pages and save them into spreadsheet format.

Examples of uses for this software might include: 

  • Creating a database of relevant links and metadata from a broad Google search,
  • Capturing all underlying data files from within a suite of web pages,
  • Capturing all images from a list of relevant web sites,
  • Capturing all text contained in HTML lists within a set of related web sites, and
  • Capturing and converting embedded raw data into Excel spreadsheets.
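For readers curious about what harvesting and scraping involve, the sketch below shows the general idea using only the Python standard library. It is not how Outwit works internally; the class name, helper function, and sample page (including the example.org links) are all hypothetical and exist only for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class LinkAndListParser(HTMLParser):
    """Collect link targets and the text of <li> items from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []        # href values found on <a> tags
        self.list_items = []   # cleaned text of each <li>
        self._in_item = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "li":
            self._flush()          # close any item left open
            self._in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._flush()
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self._buffer.append(data)

    def _flush(self):
        # Collapse runs of whitespace and store non-empty item text.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.list_items.append(text)
        self._buffer = []


def to_csv(rows, header):
    """Return rows as CSV text that a spreadsheet program can open."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical page content standing in for a downloaded web page.
SAMPLE_HTML = """
<html><body>
  <ul>
    <li><a href="https://example.org/data/2019.csv">2019 data</a></li>
    <li><a href="https://example.org/data/2020.csv">2020 data</a></li>
    <li>Notes on   method</li>
  </ul>
</body></html>
"""

parser = LinkAndListParser()
parser.feed(SAMPLE_HTML)
print(to_csv([[link] for link in parser.links], ["link"]))
print(to_csv([[item] for item in parser.list_items], ["list_item"]))
```

In practice the page text would be downloaded first, and a site's terms of use should be checked before harvesting its content.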

Harvesting and re-purposing this type of information is far more powerful than simply looking at it or manually re-creating it by copying and pasting across multiple web sites.


SOFTWARE

DiRT - Digital Research Tools Wiki - a portal that helps scholars (particularly in the humanities and social sciences) find digital research tools

Zanran Numerical Data Search is a search engine for extracting data and statistics from PDF documents. 

Tools for Importing New Data

There are many tools and platforms that allow for the importing of structured data.

In many cases, data must be configured according to specific standards before being ingested. Depending upon the type of data and intended manipulation, you may need to perform significant coding and modifications.

Examples include converting time-dependent positional datum for GIS data sets, converting satellite data to standard calibrations before merging photometry for analysis, and geo-referencing maps with differing scales for overlay accuracy.
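As a much simpler illustration of this kind of pre-import preparation, the sketch below normalizes two common field types before loading: coordinates recorded in degrees, minutes, and seconds are converted to decimal degrees, and dates in mixed formats are rewritten as ISO 8601. The function names, the accepted date formats, and the sample values are assumptions chosen for the example; the standards your target tool expects may differ.

```python
from datetime import datetime


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South and west hemispheres are returned as negative values.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere.upper() in ("S", "W") else value


def to_iso_date(text, formats=("%m/%d/%Y", "%Y-%m-%d", "%d.%m.%Y")):
    """Rewrite a date string as YYYY-MM-DD, trying each assumed format in turn."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


# Hypothetical record as it might appear in a raw source file.
raw = {"lat": (41, 30, 0, "N"), "lon": (87, 45, 0, "W"), "observed": "03/15/2013"}

clean = {
    "lat": dms_to_decimal(*raw["lat"]),
    "lon": dms_to_decimal(*raw["lon"]),
    "observed": to_iso_date(raw["observed"]),
}
print(clean)
```

Raising an error on an unrecognized date, rather than guessing, keeps malformed records from being imported silently.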

Copyright © 2013 | The Library at Saint Xavier University, 3700 W. 103rd St., Chicago, IL 60655 | Phone (773) 298-3352 | Fax (773) 298-5231 | Email: ask@sxu.libanswers.com