The application has been designed to illustrate functionality and has been populated with invented data and relationships to accomplish this purpose. While CrunchBase data and City of San Francisco location data has served as a starting point, the data in the application differs from these starting sources significantly (and not in a good way!).
Launch the application using JettyRunner: http://localhost:8080/search
The application uses Google Maps, and the Google API key will only permit access when using the above address.
The application will only produce results on exact matches.
The easiest way to get test data is to start off by querying a San Francisco zipcode where there happen to be many startups. PLEASE REMEMBER THAT THE DATA AND RELATIONSHIPS ARE NOT IN ANY WAY ACCURATE.
Here are some sample search terms for which the database has information:
Startup Name Searches
Investor Name Searches
Zipcode Searches
Searches on Individuals
The San Francisco Startup Atlas is a web application that tracks startup activity in San Francisco by plotting the location of startup companies on a map of the city. The intent is to illuminate the nexus between startups as logical/financial entities and their physical manifestations in terms of office space.
The application provides a simple interface that allows searching for startup companies based on a variety of criteria. The results are plotted on a Google map. Information on particular startups can be viewed by clicking on its map icon.
In addition to physical location, the application provides details regarding each startup's funding. The names of principals, founders, and investors can also be used as a starting point for searches.
The application contains data on past startups as well as present-day startups.
The San Francisco Startup Atlas is intended to serve two primary purposes:
The following are some core use-cases around which the San Francisco Startup Atlas is being developed:
This application is being developed as a class project for cs333 at University of San Francisco, taught by EJ Jung.
The San Francisco Startup Atlas is built with the following technology components:
Some of the data used by The San Francisco Startup Atlas has been extracted from CrunchBase. Taken as a whole, the data consists of approximately 10,000 records. Approximately one quarter of the database entries represent startup companies, with the rest representing funding entities, details on funding rounds, addresses, and individuals.
Thanks to everyone at CrunchBase for making this data publicly available!
Address and geolocation information has been extracted from data made available by the city of San Francisco.
As data is entered into the database irregularly and by batch, speed of lookup is the primary driver for indexing.
Most lookups for this application are equality lookups: we are searching for specific startups that meet certain criteria, then populate their attributes based on equality relationships. For this reason most indexes will be hash indexes. The exception here is with regards to addresses and people: columns in these tables are indexed using B+ tree indexes to facilitate range searches.
The database used for this application is PostgreSQL, which automatically indexes all primary keys and all columns with unique constraints. The default PostgreSQL indexing method is B+ tree. By default, PostgreSQL's automatic indexing is not be case insensitive. The 'citext' extension has been applied to the database so that these indexes will behave in a case-insensitive manner, and all text fields have been defined as citext.
addresses
located_at
people
employed_at
funding_rounds
investors
startups
The queries necessary to create and the database and populate it with data from .csv files is located in the create_db.sql file.
CSV files containing test data have been created for each table:
The following queries support the application's core functionality and are contained in the application_queries.sql file.
SELECT table_name FROM information_schema.tables WHERE table_schema = 'public';
select startups.name,
startups.vertical,
startups.since,
addresses.street_address,
addresses.latitude,
addresses.longitude,
count(funding_rounds.amount) as rounds,
sum(funding_rounds.amount) as total_funding
from startups
join located_at on startups.name = located_at.startup
join addresses on located_at.street_address = addresses.street_address
join funding_rounds on startups.name = funding_rounds.startup
where addresses.zipcode='94107'
group by startups.name,
startups.vertical,
startups.since,
addresses.street_address,
addresses.latitude,
addresses.longitude;
select startups.name,
startups.vertical,
startups.since,
addresses.street_address,
addresses.latitude,
addresses.longitude,
count(funding_rounds.amount) as rounds,
sum(funding_rounds.amount) as total_funding
from startups
join located_at on startups.name = located_at.startup
join addresses on located_at.street_address = addresses.street_address
join funding_rounds on startups.name = funding_rounds.startup
where startups.name = 'Yelp'
group by startups.name,
startups.vertical,
startups.since,
addresses.street_address,
addresses.latitude,
addresses.longitude;
select startups.name,
startups.vertical,
startups.since,
addresses.street_address,
addresses.latitude,
addresses.longitude,
count(funding_rounds.amount) as rounds,
sum(funding_rounds.amount) as total_funding,
employed_at.role,
employed_at.from_date,
employed_at.to_date
from employed_at
join people on people.id = employed_at.people_id
join startups on startups.name = employed_at.name
join located_at on startups.name = located_at.startup
join addresses on located_at.street_address = addresses.street_address
join funding_rounds on startups.name = funding_rounds.startup
where people.first_name='Bob' and
people.last_name='Goodson'
group by startups.name,
startups.vertical,
startups.since,
addresses.street_address,
addresses.latitude,
addresses.longitude,
employed_at.role,
employed_at.from_date,
employed_at.to_date;
select startups.name,
startups.vertical,
startups.since,
addresses.street_address,
addresses.latitude,
addresses.longitude,
count(funding_rounds.amount) as rounds,
sum(funding_rounds.amount) as total_funding,
investors.name
from investors
join funding_rounds on funding_rounds.investor = investors.name
join startups on funding_rounds.startup = startups.name
join located_at on startups.name = located_at.startup
join addresses on located_at.street_address = addresses.street_address
where investors.name = 'Angelrush Ventures'
group by startups.name,
startups.vertical,
startups.since,
addresses.street_address,
addresses.latitude,
addresses.longitude,
investors.name;
The following development tools were used in creating this project:
David Westgate is a Computer Science student at University of San Francisco.
Distributed under the MIT License.