TRIAS example extraction

Milan Straka edited this page May 17, 2021 · 14 revisions

This page describes how single mobility requests (trips belonging to a single search for transportation possibilities between two points) are extracted from TRIAS data, how ridesharing data is generated, and how the examples are enriched with TSP data. Note that to run these scripts, you need either access to the private trias-generator-xml-examples repository (which stores privacy-sensitive TRIAS examples) or a TRIAS XML file.

Single mobility request extraction

For T3.1, we were provided with an example of TRIAS data containing offers for searches for transportation between two points, each stored as a TripResult. All 559 TripResults (each representing a single trip) in the provided TRIAS XML example example-offers.xml were stored in a single file, inside the same trip search element, although the trips took place on different days and in different locations. We therefore analyzed the file, found similarities among the trips, and decided to group them based on two similarity conditions. Trips belonging to a single mobility request:

  • should start and end at the same stop point, i.e. have the same Start and End StopPoint (or Address)
  • should have start times within ±1 hour
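
The two conditions can be sketched on plain Python values as follows. This is a minimal illustration rather than the project's code: the TripSummary record, its field names, and the sample trips are hypothetical stand-ins for the values read from each TripResult, and each trip is compared with the first trip of a group.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TripSummary:
    """Hypothetical stand-in for the fields read from one TripResult."""
    start_stop: str
    end_stop: str
    start_time: datetime


def same_request(a: TripSummary, b: TripSummary,
                 window: timedelta = timedelta(hours=1)) -> bool:
    """Return True if both similarity conditions hold for trips a and b."""
    same_endpoints = (a.start_stop, a.end_stop) == (b.start_stop, b.end_stop)
    close_in_time = abs(a.start_time - b.start_time) <= window
    return same_endpoints and close_in_time


def group_trips(trips):
    """Greedily put each trip into the first group whose first trip matches."""
    groups = []
    for trip in trips:
        for group in groups:
            if same_request(trip, group[0]):
                group.append(trip)
                break
        else:  # no existing group matched, so start a new one
            groups.append([trip])
    return groups


# Made-up sample data for illustration only.
trips = [
    TripSummary("A", "B", datetime(2021, 5, 17, 8, 0)),
    TripSummary("A", "B", datetime(2021, 5, 17, 8, 45)),
    TripSummary("A", "B", datetime(2021, 5, 17, 10, 30)),
    TripSummary("A", "C", datetime(2021, 5, 17, 8, 10)),
]
groups = group_trips(trips)
group_sizes = [len(g) for g in groups]
```

Because a trip is only compared with the first member of each group, the result can depend on the order of the input; a different anchoring of the time window would give slightly different groups.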

This was done using the script mobility_request_generation_TRIAS.py. In this way we obtained 129 single mobility requests (grouped TripResults), from which we chose 11 examples based on the transport modes used and the number of TripResults. Each of these single mobility requests is stored in an XML file in the directory basic_examples, and the coactive:UserId in each single mobility request ends with its number, e.g. for sing_mob_exmpl_3.xml the UserId is '8d6ba330-fefd-44ef-87a5-exmpl3'.

  • Note that if you do not have access to the private Ride2Rail/trias-generator-xml-examples repository containing these examples, some of these links will not work. The repository stores sensitive data and is therefore private.

Ridesharing data

To enrich the examples with ridesharing (RS) data, which was missing from the provided example, we chose 3 of the extracted single mobility requests {sing_mob_exmpl_1.xml, sing_mob_exmpl_3.xml, sing_mob_exmpl_10.xml} and generated RS data for them {rs_exmpl_1.xml, rs_exmpl_3.xml, rs_exmpl_10.xml}. The XML schema used for ridesharing is mainly inherited from the plain TRIAS XSD schema. Only the nodes within the “coactive:” namespace (e.g. OfferItemContext) belong to the extended schema.

The RS data was added in the following way: we took a trip from a single mobility request example and replaced one or two TripLegs with a single RS leg (a ContinuousLeg), following the agreed form from the RS XSD schema. We added 'rs' to the UserId of each ridesharing single mobility request, e.g. '8d6ba330-fefd-44ef-87a5-rs-exmpl3'. Each RS leg has an Id with the file number appended, e.g. rs_exmpl_3.xml has a ridesharing leg with Id 'RS-leg-id-3'. Also, the OfferId and TickedId for ridesharing both end with the suffix 'rs1'. The 3 ridesharing-enriched TRIAS XML files are in the directory RS_examples.
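
The leg replacement can be sketched as follows. This is an illustration under stated assumptions, not the code of ridesharing_generator.py: the helper replace_legs_with_rs is hypothetical, the Trip layout is heavily simplified (only LegId and an empty leg body), and the standard-library xml.etree.ElementTree is used in place of lxml so that the sketch runs without third-party packages.

```python
import xml.etree.ElementTree as ET

TRIAS_NS = "http://www.vdv.de/trias"


def q(tag: str) -> str:
    """Qualify a tag name with the TRIAS namespace."""
    return f"{{{TRIAS_NS}}}{tag}"


def replace_legs_with_rs(trip, first, count, leg_id):
    """Hypothetical helper: replace `count` TripLeg children, starting at
    leg index `first`, with one TripLeg that wraps a ContinuousLeg."""
    legs = trip.findall(q("TripLeg"))
    to_replace = legs[first:first + count]
    position = list(trip).index(to_replace[0])
    for leg in to_replace:
        trip.remove(leg)
    rs_leg = ET.Element(q("TripLeg"))
    ET.SubElement(rs_leg, q("LegId")).text = leg_id
    ET.SubElement(rs_leg, q("ContinuousLeg"))  # RS details omitted in this sketch
    trip.insert(position, rs_leg)
    return rs_leg


# Simplified Trip with three legs (real TRIAS legs carry far more detail).
trip = ET.Element(q("Trip"))
for n in range(1, 4):
    leg = ET.SubElement(trip, q("TripLeg"))
    ET.SubElement(leg, q("LegId")).text = str(n)
    ET.SubElement(leg, q("TimedLeg"))

# Replace the second and third legs with a single ridesharing leg.
replace_legs_with_rs(trip, first=1, count=2, leg_id="RS-leg-id-3")
leg_ids = [leg.find(q("LegId")).text for leg in trip.findall(q("TripLeg"))]
```

Inserting the new leg at the position of the first removed leg keeps the remaining legs in their original order.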

For more information, please read "R2R description.docx" and the ridesharing-extended-xml-examples section of the trias-generator-xml-examples repo.

Code structure and running

Here, we describe the code for single mobility request generation from the provided TRIAS example, RS data addition, and TSP data generation. To run these scripts, you need to add the xml_examples directory to the root of this repository.

Single mobility request extraction

Description of mobility_request_generation_TRIAS.py

First, we split the large TRIAS XML example into its TripResults (559 in total) using the following code:

from lxml import etree
# parser and NS (the namespace map) are defined elsewhere in the script
trias_big_root = etree.parse("../xml_examples/example-offers.xml", parser=parser).getroot()
trip_res_list = trias_big_root.findall(".//ns3:TripResult", NS)
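
The namespace-aware lookup can be tried on a small inline document. In this sketch the NS mapping and the sample XML are assumptions (the script's actual NS dictionary and parser are not shown on this page), and the standard-library xml.etree.ElementTree stands in for lxml; the findall call with a prefix map works the same way in both.

```python
import xml.etree.ElementTree as ET

# Assumed prefix-to-URI mapping; the script's real NS dictionary may differ.
NS = {"ns3": "http://www.vdv.de/trias"}

# Tiny made-up document with the same nesting as a TRIAS trip response.
SAMPLE = """<ns3:Trias xmlns:ns3="http://www.vdv.de/trias">
  <ns3:ServiceDelivery>
    <ns3:DeliveryPayload>
      <ns3:TripResponse>
        <ns3:TripResult><ns3:ResultId>r1</ns3:ResultId></ns3:TripResult>
        <ns3:TripResult><ns3:ResultId>r2</ns3:ResultId></ns3:TripResult>
      </ns3:TripResponse>
    </ns3:DeliveryPayload>
  </ns3:ServiceDelivery>
</ns3:Trias>"""

root = ET.fromstring(SAMPLE)
# ".//" searches all descendants, so the nesting depth does not matter.
trip_results = root.findall(".//ns3:TripResult", NS)
result_ids = [tr.find("ns3:ResultId", NS).text for tr in trip_results]
```

The prefix used in the query only has to match a key of the mapping passed to findall; it does not have to match the prefix used inside the XML file.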

We iteratively merge the TripResults: a TripResult joins an existing group if it has the same first and last stops (or addresses) as the first TripResult of that group (start_stop_equal) and starts within ±1 hour of it (time_within_range). Otherwise, it starts a new group.

merged_trip_res = [[trip_res_list[0]]]
# merge the trips having similar start time and same first and last stop into a list
for trip_res_1 in trip_res_list[1:]:
    merged = False
    for merged_group in merged_trip_res:
        trip_res_2 = merged_group[0]
        if start_stop_equal(trip_res_1[1], trip_res_2[1]) and time_within_range(trip_res_1[1], trip_res_2[1]):
            merged_group.append(trip_res_1)
            merged = True
            break
    if not merged:
        merged_trip_res.append([trip_res_1])

As the next step, we examined the groups of trips belonging to a single mobility request and picked 11 of them to serve as examples. They are stored in the list sel_merged_trips:

selected_merged_trips = [4, 6, 16, 17, 24, 29, 38, 41, 60, 63, 11]

sel_merged_trips = [merged_trip_res[i] for i in selected_merged_trips]

Next, we create the enclosing XML structure that is shared by all examples:

ggg_parent = etree.Element('{http://www.vdv.de/trias}Trias', nsmap=NS)
gg_parent = etree.Element('{http://www.vdv.de/trias}ServiceDelivery', nsmap=NS)
g_parent = etree.Element('{http://www.vdv.de/trias}DeliveryPayload', nsmap=NS)
parent = etree.Element('{http://www.vdv.de/trias}TripResponse', nsmap=NS)

ggg_parent.append(gg_parent)

# append the user info
for t_el in trias_big_root[0]:
    if t_el.tag == '{http://www.vdv.de/trias}DeliveryPayload':
        break
    gg_parent.append(deepcopy(t_el))

gg_parent.append(g_parent)

Next, we write the TRIAS example for each single mobility request. First, we add locations and addresses to the beginning of the file; then we copy the TripResults and write the file into the xml_examples/basic_examples directory under the name sing_mob_exmpl_i.xml, where i is the number of the example.

for i, mer_trip_res in enumerate(sel_merged_trips):
    parent = etree.Element('{http://www.vdv.de/trias}TripResponse', nsmap=NS)
    g_parent.append(parent)
    # get stop references from trip list
    my_loc_list = get_loc_list(mer_trip_res)
    # get list of full locations
    stop_ref_list = get_stop_ref_list(my_loc_list, location_dict)
    loc_element = create_Locations_element(stop_ref_list)

    parent.append(loc_element)
    # deepcopy the trips into parent TripResponse
    for trip_res in mer_trip_res:
        parent.append(deepcopy(trip_res))
    # change the user ID according to the example
    change_user_id(gg_parent, i)
    et = etree.ElementTree(ggg_parent)
    et.write('../xml_examples/basic_examples/sing_mob_exmpl_' + str(i) + '.xml', pretty_print=True,
             xml_declaration=True, encoding='UTF-8', standalone='yes')
    # remove reference from parents
    parent.getparent().remove(parent)
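
The loop above reuses one wrapper tree for all output files: a TripResponse is attached, the whole tree is serialized, and the TripResponse is detached again. The following sketch shows this pattern in isolation. It is an assumption-laden illustration: the groups list and the "trias" prefix are made up, an in-memory buffer stands in for the output file, and the standard-library xml.etree.ElementTree is used instead of lxml (so the child is removed through its known parent rather than through getparent()).

```python
import io
import xml.etree.ElementTree as ET

TRIAS_NS = "http://www.vdv.de/trias"
ET.register_namespace("trias", TRIAS_NS)  # prefix chosen for this sketch only


def q(tag):
    """Qualify a tag name with the TRIAS namespace."""
    return f"{{{TRIAS_NS}}}{tag}"


# Shared wrapper, built once and reused for every output document.
root = ET.Element(q("Trias"))
delivery = ET.SubElement(root, q("ServiceDelivery"))
payload = ET.SubElement(delivery, q("DeliveryPayload"))

groups = [["r1", "r2"], ["r3"]]  # hypothetical ResultIds per mobility request
documents = []
for group in groups:
    response = ET.SubElement(payload, q("TripResponse"))
    for result_id in group:
        result = ET.SubElement(response, q("TripResult"))
        ET.SubElement(result, q("ResultId")).text = result_id
    buffer = io.BytesIO()  # stands in for the output file
    ET.ElementTree(root).write(buffer, encoding="UTF-8", xml_declaration=True)
    documents.append(buffer.getvalue())
    payload.remove(response)  # detach so the next document starts clean

# Parse each document back and count its TripResult elements.
counts = [len(list(ET.fromstring(doc).iter(q("TripResult")))) for doc in documents]
```

Without the final removal, each later file would also contain the TripResponse elements of all earlier groups.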

If you set the merged_trip_stats variable to True, the script also outputs statistics about the trips, tickets, transport modes, etc.

Enrichment of TRIAS examples by ride-sharing

The ridesharing data was generated from the examples sing_mob_exmpl_1.xml, sing_mob_exmpl_3.xml, and sing_mob_exmpl_10.xml, using the code in ridesharing_generator.py together with manual amendments.

The generation of RS data follows the agreed XSD schema for ridesharing data. The trip containing the RS data is then printed.