combine_gtfs_feeds documentation

stefancoe edited this page Oct 13, 2021 · 21 revisions

Overview

combine_gtfs_feeds is a command line tool that combines multiple GTFS feeds into a single feed/dataset. Its main purpose is to let you work from one GTFS feed when performing transit service analysis for a particular geographic location. The Puget Sound region, for example, has 7 different transit agencies, and each publishes its own GTFS feed. We (PSRC) use GTFS for many analytical, mapping, and network modeling applications. For example, we may need to find all the transit stops in the region that have frequent service and then determine the population or number of jobs within a certain distance of those stops. We often rely on other Python packages to do this work, but starting from one unified GTFS feed for the region makes the process much easier.

Installation

First clone the repository, then navigate to the root directory and enter the following in a command prompt:
python setup.py install
This will install combine_gtfs_feeds in your current python environment.

Command line arguments:

Arguments:
combine_gtfs_feeds <run> -g <input_dir> -s <service_date> -o <output_dir>

combine_gtfs_feeds is the entry point to the module.

run is a sub-command argument. It was added because we hope to add the ability to download GTFS feeds, which would require its own sub-command and arguments.

  • -g, --gtfs_dir The location of the folders containing the GTFS files from each feed. Each feed should be stored in its own folder, and all of these folders should be in the same directory. Give each folder a name that clearly identifies the feed it contains. For example, a good name for the folder holding Sound Transit's feed might be 'ST'. This name will be prepended to each ID (route_id, stop_id, trip_id & shape_id) in the output GTFS files to uniquely identify the origin feed and prevent duplicate IDs.

  • -s, --service_date The date, in YYYYMMDD format, of the service that the combined GTFS will represent. Each feed in the --gtfs_dir must have at least one service ID that includes this date, or the program will exit. The idea here is to pick a date that is typical of the service you wish to analyze. For example, we use a non-holiday Tuesday in May to represent weekday spring service. Note that the output of this program will only include service for this date. The program must be run independently for each date of interest.

  • -o, --output_path The location where the resulting GTFS feed will be written. The output includes the following GTFS files and a log file:

    • calendar.txt
    • routes.txt
    • trips.txt
    • stop_times.txt
    • stops.txt
    • shapes.txt
    • agency.txt
    • run_log.txt
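
To illustrate the ID prefixing described under --gtfs_dir, here is a minimal Python sketch. It is not the tool's actual implementation: the helper name prefix_ids and the underscore separator are assumptions made for illustration, and the real output may join the folder name and the ID differently.

```python
import csv
import io

# ID columns that the documentation says receive the folder name as a prefix.
ID_COLUMNS = ("route_id", "stop_id", "trip_id", "shape_id")


def prefix_ids(rows, feed_name, id_columns=ID_COLUMNS, sep="_"):
    """Return copies of rows with feed_name prepended to each non-empty ID column.

    Hypothetical helper: the separator is an assumption, not the tool's
    documented behavior.
    """
    prefixed_rows = []
    for row in rows:
        new_row = dict(row)
        for col in id_columns:
            if new_row.get(col):
                new_row[col] = f"{feed_name}{sep}{new_row[col]}"
        prefixed_rows.append(new_row)
    return prefixed_rows


# A tiny in-memory stand-in for a trips.txt file from a folder named 'ST'.
sample = io.StringIO("route_id,trip_id,shape_id\n100,t1,s1\n")
rows = list(csv.DictReader(sample))
prefixed = prefix_ids(rows, "ST")
print(prefixed)
```

Because every ID carries its folder name, two agencies that both use a route_id of 100 remain distinct in the combined feed.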

Example:
combine_gtfs_feeds run -g c:/gtfs_folder -s 20210914 -o c:/output_folder
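
Since the program exits when a feed has no service on the chosen date, it can help to check a date against a feed beforehand. The sketch below shows one way to do that using the standard GTFS calendar.txt and calendar_dates.txt fields. The function name active_service_ids is hypothetical, and this is not necessarily how combine_gtfs_feeds itself selects service IDs.

```python
from datetime import datetime

WEEKDAYS = ("monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday")


def active_service_ids(calendar_rows, calendar_date_rows, service_date):
    """Return the set of service_ids active on service_date (YYYYMMDD).

    calendar_rows: dicts read from calendar.txt
    calendar_date_rows: dicts read from calendar_dates.txt (may be empty)
    """
    day = WEEKDAYS[datetime.strptime(service_date, "%Y%m%d").weekday()]
    # Regular service: date falls in the range and the weekday flag is set.
    # YYYYMMDD strings compare correctly as plain strings.
    active = {
        row["service_id"]
        for row in calendar_rows
        if row["start_date"] <= service_date <= row["end_date"]
        and row[day] == "1"
    }
    # Exceptions: type 1 adds service on a date, type 2 removes it.
    for row in calendar_date_rows:
        if row["date"] != service_date:
            continue
        if row["exception_type"] == "1":
            active.add(row["service_id"])
        elif row["exception_type"] == "2":
            active.discard(row["service_id"])
    return active


# Made-up sample data for illustration only.
weekday_flags = dict(zip(WEEKDAYS, "1111100"))
saturday_flags = dict(zip(WEEKDAYS, "0000010"))
calendar = [
    {"service_id": "WKD", "start_date": "20210901", "end_date": "20211231", **weekday_flags},
    {"service_id": "SAT", "start_date": "20210901", "end_date": "20211231", **saturday_flags},
]
calendar_dates = [
    {"service_id": "EXTRA", "date": "20210914", "exception_type": "1"},
]
tuesday_services = active_service_ids(calendar, calendar_dates, "20210914")
print(sorted(tuesday_services))
```

An empty result for any feed suggests that the date is outside that feed's published service period and a different date should be chosen.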

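Because the program must be run once per date of interest, a small script can build one command per date. This sketch uses only the arguments documented above. The helper name build_commands and the choice of one output subfolder per date are assumptions, and whether the tool creates a missing output folder is not stated here, so you may need to create the folders first.

```python
def build_commands(gtfs_dir, output_root, service_dates):
    """Build one combine_gtfs_feeds command (as an argument list) per date.

    Hypothetical helper: writes each date's output to its own subfolder
    of output_root, which is an illustrative layout choice.
    """
    commands = []
    for date in service_dates:
        commands.append([
            "combine_gtfs_feeds", "run",
            "-g", gtfs_dir,
            "-s", date,
            "-o", f"{output_root}/{date}",
        ])
    return commands


commands = build_commands(
    "c:/gtfs_folder", "c:/output_folder", ["20210914", "20210918"]
)
for cmd in commands:
    print(" ".join(cmd))
    # To actually run each command:
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```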