Integrating line planning, timetabling, and vehicle scheduling: a customer-oriented heuristic

Given an existing public transportation network, the classic planning process in public transportation is as follows: In a first step, the lines are designed; in a second step a timetable is calculated and finally the vehicle and crew schedules are planned. The drawback of this sequence is that the main factors for the costs (i.e. the number of vehicles and drivers needed) are only determined in a late stage of the planning process. We hence suggest reordering the classic sequence of the planning steps: In our new approach we first design the vehicle routes, then split them into lines and finally calculate a (periodic) timetable. The advantage is that costs can be controlled during the whole process while the objective in all three steps is customer-oriented. In the paper we formulate an integrated model from which we develop this new approach, discuss the complexity of the resulting problems, and present a heuristic which we applied within a case study, optimizing the local bus system in Göttingen, Germany.

[Figure: The classic planning phases in public transportation (left) compared to the sequence used in this paper (right)]

In this paper we deal with the following three steps: line planning, timetabling, and vehicle scheduling.
Given a public transportation network PTN = (V , E) which is an undirected graph consisting of a set of stops or stations V and a set of links between them, we are hence looking for lines, their frequencies, a (periodic) timetable and for the vehicle schedules. Our goal is to provide a public transportation system which is as attractive as possible for potential passengers. In our case study we measure attractiveness by the time a passenger has to wait for a connection and by the time for traveling by public transport (which we compare to the time for traveling by car).
In the following we will talk about stops and buses only. However, the idea of our approach can also be transferred to commuter railway systems or long-distance railway planning. Before formulating our problem we sketch the single planning steps line planning, timetabling and vehicle scheduling in their classical order and review what has been done in these fields so far. Note that we do not consider crew and duty scheduling in this paper, neither in our approach nor in the literature review.
Line planning. A line $l$ is a path in the public transportation network PTN. The frequency $f_l$ of a line $l$ says how often service is offered along line $l$ within a (given) time period $T$. A line concept is a set of lines together with their frequencies.
In most research papers it is assumed that a line pool of potential lines is already given. The goal is to choose a set of lines from the pool and to assign frequencies to the lines chosen. Even the feasibility problem (finding frequencies such that the constraints at each edge are satisfied) is NP-hard (see Bussieck 1998; Claessens et al. 1998).
One distinguishes between cost-oriented models (see e.g. Claessens et al. 1998; Zwaneveld 1997; Goossens 2004; Bussieck et al. 2004; Goossens et al. 2006) in which the line concept has to cover a given demand with smallest possible costs, and customer-oriented models where a budget is given that should be used in a way that is "best" for the passengers. Examples for customer-oriented objective functions are to maximize the number of direct travelers (Bussieck et al. 1996; Bussieck 1998) or to minimize the traveling time of the passengers (see Borndörfer and Pfetsch 2006; Schöbel and Scholl 2006; Scholl 2005). A model for simultaneous optimization of transit lines and passenger line assignment in a general network is presented in Guan et al. (2006). Designing lines which can compete with the private mode has been studied in Laporte et al. (2005, 2007). Note that Claessens et al. (1998) already considered an approximation of the costs of the vehicle schedules.
There are rather few papers in which the line pool itself is constructed. In the very first paper about line planning, Patz (1925) starts with a line for each OD pair and iteratively eliminates lines by a greedy approach. A similar greedy heuristic is due to Sonntag (1977). More recently, Pape et al. (1995) and Quak (2003) suggest constructive approaches, while Borndörfer and Pfetsch (2006), Borndörfer et al. (2007) present an exact model in which routes are constructed within the optimization. Line planning aspects are also touched in Liebchen and Möhring (2007) where within a timetabling approach pre-defined line segments are combined to lines. The approach we propose in this paper can be classified as a constructive heuristic based on a customer-oriented approach.
Timetabling. Given the set of stops $V$ and the set of vehicles $F$, a timetable consists of two functions $\pi_{arr}: V \times F \to \mathbb{N}$ and $\pi_{dep}: V \times F \to \mathbb{N}$ assigning an arrival time and a departure time to each vehicle at each stop. To avoid indices, event-activity networks are used in timetabling (see e.g. Nachtigall 1998), in which the events consist of all arrivals and departures of all vehicles at all stops. The events are linked by edges corresponding to different types of activities; the most important are driving activities of vehicles between stops, waiting activities of vehicles at stops, and transfer activities to account for passengers changing buses or trains. To account for capacity issues, headway activities are added.
We have to distinguish between periodic and aperiodic timetabling. If the order of the events is fixed, the latter can be efficiently solved by shortest path techniques. If events appear periodically, an ordering is not possible in this sense. This is one reason why the periodic case is NP-hard (for a formal proof see Nachtigall 1998). The basis for tackling periodic timetabling is the periodic event scheduling problem (PESP) originally introduced in Serafini and Ukovich (1989). There are many extensive studies about timetabling; we refer to Peeters (2003), Liebchen (2006) and references therein. Current approaches deal with integration aspects (e.g. Liebchen and Möhring 2007, where the periodic event scheduling problem is extended to allow dealing with vehicle scheduling and line planning aspects) or robustness issues (Kroon et al. 2007; Fischetti et al. 2007). In our approach we are looking for a periodic timetable.
Vehicle scheduling. If the lines and the timetable are given, one can define trips, i.e. minimal paths which have to be operated by the same vehicle (usually between start and end stop of a line). For each trip its start stop with its departure time and its end stop with its arrival time are given. Two trips $trip_1$ and $trip_2$ can be served by the same bus if the arrival time at the end stop of $trip_1$ plus the time needed to drive from the end stop of $trip_1$ to the start stop of $trip_2$ is smaller than the departure time at the start stop of $trip_2$. The goal is to find a cost-minimal assignment between buses and trips such that each trip is covered by exactly one bus and the schedules of all vehicles are feasible. This results in the physical vehicle routes together with a timetable for each of the vehicles. Note that a vehicle route describes the path of a single bus which might very well serve several lines.
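The compatibility condition for two trips can be sketched in a few lines of Python. This is only an illustration: the `Trip` record, the `deadhead` dictionary of empty-run driving times and the function name `can_follow` are assumptions introduced here, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    """A trip with its start stop/departure time and end stop/arrival time (minutes)."""
    start_stop: str
    departure: int
    end_stop: str
    arrival: int


def can_follow(first: Trip, second: Trip, deadhead: dict) -> bool:
    """Return True if one bus can serve `first` and afterwards `second`.

    `deadhead[(a, b)]` is the (assumed) driving time of an empty run from
    stop a to stop b; no empty run is needed if both stops coincide.
    """
    if first.end_stop == second.start_stop:
        drive = 0
    else:
        drive = deadhead[(first.end_stop, second.start_stop)]
    # Strict inequality, following the condition stated in the text.
    return first.arrival + drive < second.departure


trip_1 = Trip("A", 0, "B", 20)
trip_2 = Trip("C", 30, "A", 50)
trip_3 = Trip("C", 24, "A", 44)
deadhead = {("B", "C"): 5}
```

With these hypothetical numbers, `trip_2` can follow `trip_1` (20 + 5 < 30), whereas `trip_3` cannot (20 + 5 is not smaller than 24).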
While the multi-depot case is NP-hard (see Bertosi et al. 1987, and Pepin et al. 2006 for a comparison of different heuristics), the single-depot case can be solved polynomially. Approaches include decomposition models (Saha 1972), assignment models (Orloff 1976), transportation models (Gavish and Shlifer 1978) or network flow models (Daduna and Paixao 1995). A recent survey paper dealing with bus scheduling is Bunte and Kliewer (2009); railway issues are treated in Maróti (2006).
Recent research in vehicle scheduling includes route constraints (e.g. Kliewer et al. 2006), or maintenance issues. Robustness issues are considered within the framework of ARRIVAL (Arrival 2006-2009).
Our contribution. In contrast to the approaches in the literature and to the classic planning process in public transportation, we follow a new approach in this paper. Instead of determining the lines in a first step, we start by determining a route for each vehicle. These routes are interpreted as lines in a second step. In the third step we finally add a timetable for each vehicle.
Our approach is based on the observation that both objective functions, namely cost aspects and passengers' aspects can already be computed if the vehicles' routes and schedules are known. This means that we can consider both objectives throughout the whole planning process.
Since we are looking for a periodic schedule we assume that one common period T is given after which everything is repeated. We plan for only one period but take the periodicity into account when evaluating our objective functions.

An integrated model based on vehicle routes
Let a public transportation network PTN = $(V, E)$ be given. For each edge $e = \{i, j\}$ in the PTN let $d(e)$ be the time a bus needs for driving between stops $i$ and $j$. A vehicle route is the path of a vehicle (i.e. of a specific bus) in the PTN. It can be given as a sequence of stops in $V$ or as a sequence of edges $e \in E$. A route is called circular if its first and last stop are the same. The duration of a route $u$ is defined as the sum of all edge lengths of the route, i.e.
\[ \mathrm{dur}(u) = \sum_{e \in u} d(e). \]
The main idea of our new approach is to start the whole process by designing the vehicle routes. In order to obtain a periodic timetable we restrict ourselves to circular routes whose durations are close to a multiple of the time period T .
The set of all routes in the final public transportation system is denoted by $U$. For each vehicle route $u \in U$ let $f_u$ denote its frequency specifying how many trips are offered along the route within the period $T$. (Note that the route frequency may differ from the line frequency, e.g. if several routes serve the same line.) The schedule $t_u$ assigns an arrival and a departure time to each stop of the route.
As we will show in the following, the values $(U, f, t)$ are sufficient as variables, i.e., the objective function and the constraints can be evaluated if $U$ and $f_u$, $t_u$ are known for all $u \in U$. A solution of the problem will hence be denoted by $(U, f, t)$.
We now describe the constraints we consider.
• The costs of a public transportation system mainly depend on the number of vehicles in use. This number determines not only the investment costs but also the number of duties to be covered and hence the costs for planning crew schedules and rosters. Our budget constraint hence bounds the number of buses $N$ that we are allowed to use. We only construct circular routes $u$ with a duration $\mathrm{dur}(u)$ such that $\mathrm{dur}(u) f_u$ is a bit less than an integer multiple $z_u \in \mathbb{Z}$ of the time period $T$, i.e. we require
\[ \mathrm{dur}(u)\, f_u \le z_u T - \eta \qquad (1) \]
for some given slack $\eta > 0$. Although this restriction might cut off good solutions, it is of great importance since it ensures that we can easily keep track of the number of vehicles needed: If $\mathrm{dur}(u) f_u$ is close to $z_u T$, the operation of route $u$ requires exactly $z_u$ buses. (Note that this is a special case in which the vehicle scheduling heuristic of Claessens et al. (1998) gives the optimal solution.) Consequently, we require
\[ \sum_{u \in U} z_u \le N. \qquad (2) \]
• Within our solution approach we also take into account that there is enough space available for buses at each of the stops. As parameters we have given a capacity $\mathrm{cap}(v)$ indicating how many buses are allowed to be at the stop $v$ at the same time. This constraint depends not only on the vehicle routes, but also on their timetables.
• Note that there are usually more constraints in practice. These include breaks for the drivers, slack times to make the timetable more robust and constraints for the specific shape and structure of the lines. We do not mention them explicitly in our model but they are considered when constructing the vehicle routes in the first phase (Sect. 5.1) of our algorithm.
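The budget constraint can be illustrated with a short Python sketch. It computes, for a route with given duration and frequency, the smallest integer $z_u$ for which the duration condition with slack $\eta$ holds, and then checks the bound on the total number of buses. The function names and the representation of routes as (duration, frequency) pairs are assumptions made for this example; in the heuristic itself $z_u$ is chosen first and the route is then built to fit.

```python
import math


def buses_for_route(duration, frequency, period, eta):
    """Smallest integer z with duration * frequency <= z * period - eta."""
    return math.ceil((duration * frequency + eta) / period)


def within_budget(routes, period, eta, max_buses):
    """Return (total number of buses, whether the total respects the bound N).

    `routes` is a list of (duration, frequency) pairs, one per vehicle route.
    """
    total = sum(buses_for_route(d, f, period, eta) for d, f in routes)
    return total, total <= max_buses


# Hypothetical data: period T = 60 minutes, slack eta = 3 minutes, N = 4 buses.
total, ok = within_budget([(55, 1), (115, 1), (27, 2)], period=60, eta=3, max_buses=4)
```

In this hypothetical instance the three routes need 1, 2 and 1 buses, so the bound of four buses is met exactly.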
As objective function we chose the attractiveness of the public transportation system. Since we are not only interested in improvements for existing customers but also in attracting new customers, we use an origin-destination matrix representing the complete demand. The data available is usually not based on stops, but is given per demand region (called cells). It does not make sense to split this demand to each single stop (e.g. one at each side of the street), so we aggregated the set of stops $V$ to a set of locations $B$, where a location usually is a set of stops with the same name. (In most cases two stops on either side of a road form a location.) We then generated an origin-destination matrix $OD \in \mathbb{Z}^{|B| \times |B|}$ based on the locations. For each pair $i, j \in B$ of locations the value $OD_{ij}$ hence represents the number of persons who want to travel from $i$ to $j$, i.e. the potential number of customers for this OD-pair.
We define the attractiveness of a public transportation system as the average probability p ij that a (potential) traveler between locations i and j decides to use public transportation instead of the private mode. This probability depends on many factors. Usually, a passenger will compare possible journeys in the bus system with the alternative of using a car. In order to calculate the attractiveness we hence have to do the same. Fortunately, this can be done if (U, f, t) is known since the set of possible journeys P ij a passenger can use between i and j is already determined by the vehicle routes and their schedules. It can be calculated by shortest path algorithms in an appropriate timetable graph defined by the PTN and the solution (U, f, t) to be evaluated, see (Bauer et al. 2007) for a recent comparison of methods. Note that footpaths connecting nearby stops are also taken into account to model the fact that passengers walk from one stop to another.
Formally, our objective function hence is given as
\[ \mathrm{att}(U, f, t) = \sum_{(i,j) \in B \times B} p_{ij}(U, f, t)\, OD_{ij}, \qquad (3) \]
where $OD_{ij}$ is the potential demand between locations $i$ and $j$ and $p_{ij}(U, f, t)$ is the probability that a person who wants to travel between locations $i$ and $j$ uses the public transportation system $(U, f, t)$. We assume that this probability depends on the set of journeys $P_{ij}$. In our case study, we considered the number of possible journeys and their durations compared with the private mode in order to estimate the probability $p_{ij}$ (see Sect. 3). Note that the idea to compare the traveling times in public and private mode has also been used by Laporte, Mesa and Ortega, see Laporte et al. (2005). Summarizing, we deal with the following problem.
(P) Given a PTN = $(V, E)$ with edge lengths $d(e)$ for each $e \in E$, a set of locations $B$ and an origin-destination matrix $OD$, a function $p_{ij}(U, f, t)$ evaluating the users' behavior (based on $P_{ij}$), a time period $T$, and an integer $N$, find a solution $(U, f, t)$ with at most $N$ buses maximizing $\mathrm{att}(U, f, t)$.
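As a minimal sketch of how the objective of (P) is evaluated once the acceptance probabilities are known, the following Python function sums demand times probability over all OD-pairs. The dictionary-based data layout and the function name are assumptions for illustration; pairs without an entry in `probability` are treated as having no acceptable journey.

```python
def attractiveness(od, probability):
    """Demand-weighted sum of acceptance probabilities over all OD-pairs.

    `od[(i, j)]` is the potential demand from location i to location j and
    `probability[(i, j)]` the probability that such a traveler uses public
    transport; missing pairs count with probability 0.
    """
    return sum(demand * probability.get(pair, 0.0) for pair, demand in od.items())


# Hypothetical demand and probabilities for three OD-pairs.
od = {("a", "b"): 100, ("b", "a"): 50, ("a", "c"): 10}
probability = {("a", "b"): 0.5, ("b", "a"): 1.0}
value = attractiveness(od, probability)
```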

Case study
As an illustration of the model presented in the previous section we describe our case study which we used within a cooperation with Göttinger Verkehrsbetriebe (GÖVB), the local bus company of Göttingen, Germany.
The following data was known: The public transportation network consists of 485 stops; the demand data was divided among 248 locations. The capacity of most of the stops is equal to four, i.e. in most cases not more than four buses can be at the same stop at the same time. It turned out that this is a crucial constraint: If left out we always obtained timetables in which up to ten buses stopped simultaneously at the same stop.
As edges we used all edges contained in already existing lines, but we also added further edges representing streets which are currently not used by buses, see Fig. 2. The driving times of the new edges were determined in cooperation with GÖVB. We also added footpaths between stops.
In order to estimate the traveling time in the private mode (which we used to compare with the traveling time using the bus system in the objective function), we added additional edges which are not suitable for buses (e.g. if the streets are too narrow). The edge lengths d priv (e) in the private mode are usually shorter than in the public mode, i.e. in most cases we have d priv (e) ≤ d(e). Exceptions are streets in the city center where we added additional time to account for the time-consuming task of finding a parking slot. The same can be done for streets with dedicated bus lanes or priority signals.
As demand data we received a partition of Göttingen into regions, called cells, and data about the demand for each pair of cells. We assigned locations to cells, estimated the importance of each location and expressed this by weights. Then we distributed the demand data to pairs of locations according to their assignment and weights.
An analysis of the current system showed its advantages and drawbacks: The driving times from the outskirts to the center are rather short. Moreover, twice an hour, many transfers are possible at one of the central stops. On the other hand, the capacity of this central stop is exceeded such that buses sometimes have to leave before the transferring passengers have arrived. We also noted that there are often long breaks at the end stops of the lines (up to 20% of the duration of the route).

Fig. 2 The network with all possible edges in Göttingen
Evaluating the attractiveness of a solution. Let $(U, f, t)$ be a solution of problem (P). In the following we show how we specified and computed the objective function $\mathrm{att}(U, f, t)$ within our case study. Talking to practitioners we decided to focus on the following two values in order to determine the probability that a person decides to use public transportation for his or her journey from $i$ to $j$:
• $pw_{ij}$: the average waiting time before the journey from $i$ to $j$ is started,
• $pd_{ij}$: the travel time of public transport between $i$ and $j$ compared to the travel time of the private mode.
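Next, we show how to estimate $pd_{ij}$ and $pw_{ij}$, based on the set of all possible journeys $P_{ij}$ from $i$ to $j$. Our goal is to identify paths $p \in P_{ij}$ with small traveling time. We hence collect, for each $p \in P_{ij}$,
\[ \mathrm{dur}(p) = \text{time needed to travel from } i \text{ to } j \text{ using path } p. \]
We determine the minimal traveling time $\mathrm{dur}^{\min}_{ij} = \min_{p \in P_{ij}} \mathrm{dur}(p)$ and fix a value $\lambda$ to obtain
\[ G_{ij} = \{\, p \in P_{ij} : \mathrm{dur}(p) \le \lambda \cdot \mathrm{dur}^{\min}_{ij} \text{ and there does not exist any path } p' \in P_{ij} \text{ satisfying } [\ldots] \,\} \qquad (4) \]
as the set of "good" journeys between $i$ and $j$. Based on $G_{ij}$ we estimate $pd_{ij}$ and $pw_{ij}$ as follows:

pd: We compare the travel time in public transport with the travel time using the private mode, i.e. we calculate the ratio
\[ \frac{\mathrm{public}_{ij}}{\mathrm{private}_{ij}}, \]
where $\mathrm{public}_{ij}$ denotes the average travel time in public transportation and $\mathrm{private}_{ij}$ is the travel time in the private mode. The probability $pd_{ij}$ that a customer accepts this ratio is modeled by a piecewise linear function for two parameters $\alpha_1$ and $\alpha_2$: it equals 1 if the ratio is at most $\alpha_1$, equals 0 if the ratio is at least $\alpha_2$, and decreases linearly in between.

pw: We determine the average waiting time $\mathrm{wait}_{ij}$ until the next trip in $G_{ij}$ starts. To this end, we sort the journeys in $G_{ij}$ according to $\mathrm{dep}(c)$ to obtain a list $c_1, \dots, c_m$ with $\mathrm{dep}(c_1) \le \dots \le \mathrm{dep}(c_m)$. The departures split the period into intervals $I_1, \dots, I_m$, where $I_k$ is the interval between two consecutive departures (taken cyclically over the period $T$). We assume that the demand is distributed evenly within a period, i.e. at each minute we have the same probability that a person wants to start his or her journey. If a person arrives within interval $I_k$, his or her average waiting time is $|I_k|/2$ minutes. Hence we estimate
\[ \mathrm{wait}_{ij} = \sum_{k=1}^{m} \frac{|I_k|}{T} \cdot \frac{|I_k|}{2} \]
as the average waiting time for the next journey from $i$ to $j$. Again, the probability that a customer accepts the average waiting time is modeled by a piecewise linear function (see right picture of Fig. 3) depending on the parameters $\beta_1$ and $\beta_2$.

Assuming that the probability $pw_{ij}$ to accept the average waiting time is independent of the probability $pd_{ij}$ to accept the travel time ratio, we finally get
\[ p_{ij} = pd_{ij} \cdot pw_{ij} \]
and are hence able to compute $\mathrm{att}(U, f, t)$ according to (3).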
Note that the two functions depend on the customers' behavior which is represented by the parameters α 1 , α 2 , β 1 , β 2 and λ.
In our case study these parameters are set to
• $\alpha_1 = 1.1$, $\alpha_2 = 2.5$, meaning that everybody accepts an increase of 10% of the travel time, but nobody would accept an increase by the factor 2.5,
• $\beta_1 = 7.5$, $\beta_2 = 36$, i.e. an average waiting time of 7.5 minutes (referring to a connection offered four times an hour) is accepted by all potential passengers, while an average waiting time of more than 36 minutes is not accepted at all. For public transportation at night we increased these values to 10 and 45.
• Due to (4), $\lambda$ also has an influence on the probability $p_{ij}$. In our case study we chose $\lambda = 1.3$.
Note that the specific values for the parameters have been chosen after discussion with practitioners. They make sense for the local properties of Göttingen, but need not hold in other environments. For example, in large cities, we suggest choosing smaller values for $\beta_1$ and $\beta_2$.
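The evaluation of a single OD-pair can be sketched in Python as follows. The sketch assumes that both acceptance curves fall linearly from 1 to 0 between their two parameters, that the public travel time is the plain average over the good journeys, and that departures repeat with the period; the function names and the argument layout are hypothetical.

```python
def decreasing_ramp(x, lo, hi):
    """Piecewise linear acceptance: 1 up to `lo`, 0 from `hi` on, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def average_wait(departures, period):
    """Average waiting time for uniformly arriving passengers in a periodic timetable."""
    deps = sorted(d % period for d in departures)
    gaps = [b - a for a, b in zip(deps, deps[1:])]
    gaps.append(deps[0] + period - deps[-1])  # interval wrapping around the period
    # A passenger falls into a gap g with probability g / period and waits g / 2 on average.
    return sum(g * g for g in gaps) / (2 * period)


def acceptance_probability(good_durations, departures, private_time, period,
                           alpha=(1.1, 2.5), beta=(7.5, 36.0)):
    """Product of travel-time acceptance (pd) and waiting-time acceptance (pw)."""
    if not good_durations or not departures:
        return 0.0  # no journey offered at all
    ratio = (sum(good_durations) / len(good_durations)) / private_time
    pd = decreasing_ramp(ratio, *alpha)
    pw = decreasing_ramp(average_wait(departures, period), *beta)
    return pd * pw
```

For a connection offered every 15 minutes within a period of 60 minutes, `average_wait([0, 15, 30, 45], 60)` gives 7.5 minutes, which matches the interpretation of $\beta_1$ given above.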
City center requirement. There is one more special requirement that we had to take into account in our case study: It was required that all routes pass through the city center. This condition is reasonable since the demand between two non-central locations is rather small (according to the data we had for Göttingen, and as expected from gravity models). Note that this requirement significantly reduces the set of possible vehicle routes and hence the set of feasible solutions (but not the complexity of the problem as we will show in the next section). In the following let us denote the stops of the city center by Cen.

Complexity
Problem (P) of planning lines, a timetable and the vehicle schedules simultaneously is NP-hard. This result is not very surprising since even the single planning steps are already known to be NP-hard. However, the special case of our case study in which all vehicle routes pass through one specific stop is also NP-hard, and this even holds if all frequencies have to be one, if the passengers accept public transportation whenever a journey is offered, and if the timetable is not relevant. It still holds if we do not distinguish between locations and stops. We denote this problem as (P-special). Formally it is defined as follows.

(P-special) Given an instance of (P) together with a central stop $s_c \in V$ and a bound $\mathcal{U}$ on the attractiveness, does there exist a set of vehicle routes $U$ such that
• $s_c \in u$ for all $u \in U$,
• $f_u = 1$ for all $u \in U$,
• $\sum_{e \in u} d(e) \le z_u T$ for some $z_u \in \mathbb{Z}$ for all $u \in U$ and $\sum_{u \in U} z_u \le N$ (i.e. it can be run with $N$ buses)
and such that
• $\mathrm{att}(U, f, t) = \sum_{(i,j) \in B \times B} p_{ij}(U, f, t)\, OD_{ij} \ge \mathcal{U}$?

Theorem 4.1 (P-special) is NP-hard.

Proof. We use a reduction from the knapsack problem, which is known to be NP-hard (see Garey and Johnson 1979).

(Knapsack) Given two natural numbers $W$, $\mathcal{B}$ and a set of items $B$ with weights $w(b) \in \mathbb{N}$ and benefits $v(b) \in \mathbb{N}$ for all $b \in B$, does there exist a subset $K \subseteq B$ of items with a total weight of no more than $W$ and a total benefit of at least $\mathcal{B}$?
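Given an instance of (Knapsack), an instance of (P-special) is constructed as follows. The PTN consists of a central stop $s_c$ and one stop $s_b$ for each item $b \in B$, where $s_b$ is connected to $s_c$ by an edge of length $d(\{s_c, s_b\}) = w(b) T / 2$. The demand is $OD_{s_c s_b} = v(b)$ for all $b \in B$ and zero for all other pairs. For the customers' behavior we use the simplest possible objective function, namely
\[ p_{ij}(U, f, t) = \begin{cases} 1 & \text{if there exists a journey from } i \text{ to } j \text{ in } (U, f, t), \\ 0 & \text{otherwise.} \end{cases} \]
This means that all existing paths are accepted by the passengers, independently of their timetables or other characteristics. Finally, we define $N := W$ and set the attractiveness bound of (P-special) to the benefit bound of (Knapsack). We now show that (P-special) has a feasible solution if and only if (Knapsack) has a feasible solution.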
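(P-special) ⇒ (Knapsack): Let a feasible solution for (P-special) be given with a set $U$ of routes. Every route contains the central stop $s_c$ and at least one other stop. Without loss of generality we can assume that the route contains exactly one other stop (otherwise we split it into feasible routes for each other stop $s_b$ it contains, since the length to reach $s_c$ and back is already at least $T$ and all durations are integer multiples of $T$). We define $u_b := (s_c, s_b, s_c)$ as the route passing through stop $s_b$. We now show that
\[ K := \{\, b \in B : u_b \in U \,\} \]
is a feasible solution of (Knapsack):
• The route $u_b$ takes $2 \cdot w(b) T / 2 = w(b) T$ time. Hence, in order to operate this route with a frequency of one, $w(b)$ buses are necessary (see the budget constraint in Sect. 2). Since the solution $U$ is feasible for (P-special) we conclude that $\sum_{b \in K} w(b) \le N = W$.
• On the other hand, we know that the OD-pair $(s_c, s_b)$ will use public transportation whenever $s_b \in u$ for some $u \in U$, i.e. whenever $b \in K$. Hence $\sum_{b \in K} v(b) = \mathrm{att}(U, f, t)$, which is at least the required bound.

(Knapsack) ⇒ (P-special): Let $K$ be a feasible solution of (Knapsack) and define $U := \{u_b : b \in K\}$ with $f_u = 1$ for all $u \in U$. Then
• each route $u_b$ needs $z_{u_b} = w(b)$ buses, hence $\sum_{u \in U} z_u = \sum_{b \in K} w(b) \le W = N$,
• $\mathrm{att}(U, f, t) = \sum_{b \in K} v(b)$, which is at least the required bound,
• $s_c \in u$ for all $u \in U$.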
Hence U is feasible for (P-special) and the proof is finished.
From the previous theorem we directly obtain that (P) is NP-hard even if the frequencies of all vehicle routes have to be one and if the timetable is not relevant, i.e. if f u = 1 for all u ∈ U and att(U, f, t) does not depend on t. It can also be shown that (P) is NP-hard if the vehicle routes and their frequencies are given and only the new timetable has to be found, i.e. if U and f have been fixed. We refer to Michaelis (2007) for a proof.
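The reduction used in the proof can be made concrete with a small Python sketch that builds the star-shaped instance from a knapsack instance and evaluates a chosen set of routes $u_b$. The dictionary layout, the stop names `s_c` and `s_<item>` and both function names are assumptions for illustration only.

```python
def knapsack_to_p_special(weights, values, capacity, target, period=60):
    """Build the star-shaped (P-special) instance for a knapsack instance.

    Each item b gets a stop s_b joined to the central stop s_c by an edge of
    length w(b) * T / 2; the demand from s_c to s_b is the benefit v(b).
    """
    edges = {("s_c", f"s_{b}"): weights[b] * period / 2 for b in weights}
    od = {("s_c", f"s_{b}"): values[b] for b in values}
    return {"edges": edges, "od": od, "period": period, "N": capacity, "bound": target}


def evaluate_star_routes(instance, chosen):
    """Buses needed and attractiveness when the routes (s_c, s_b, s_c), b in `chosen`, run."""
    period = instance["period"]
    buses = sum(2 * instance["edges"][("s_c", f"s_{b}")] / period for b in chosen)
    att = sum(instance["od"][("s_c", f"s_{b}")] for b in chosen)
    return buses, att


# Hypothetical knapsack instance: two items, weight limit 4, benefit target 10.
instance = knapsack_to_p_special({"a": 2, "b": 3}, {"a": 10, "b": 7}, capacity=4, target=10)
buses, att = evaluate_star_routes(instance, ["a"])
```

In this toy instance, choosing item `a` alone needs two buses and reaches the benefit target, while choosing both items would need five buses and exceed $N = 4$.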

Solution heuristic
Our approach to solve (P) is the following:
Phase 1: Design the routes $U$ and their frequencies $f_u$.
Phase 2: Split the routes into lines.
Phase 3: Find a timetable $t$ for the routes.
From Theorem 4.1 and the remark at the end of the previous section we know that Phase 1 and Phase 3 are both NP-hard even in the case of our case study in which all vehicle routes are required to pass through a set of specific nodes. Phase 2, however, is nothing but a graphical representation of the system, since the shape of the lines themselves has no influence on the attractiveness or on the costs of the transport system (since not the lines, but only the vehicle routes are needed to calculate the costs or the possible journeys for the passengers). Phase 2 can hence also be done after the timetabling step.
In the following we present heuristic algorithms for all three phases. Note that some of the ideas we used were motivated by the special requirements of Göttingen (this will be mentioned in the text), but all of them can easily be adapted to other cities.

Phase 1: Finding the vehicle routes with their frequencies
Each route is a circle in the public transportation network PTN. In Phase 1 we construct such circles and then combine them to a complete set of vehicle routes.
The basic idea of constructing one single vehicle route is simple: We first specify a duration, then we start with some (arbitrary) stop and add other stops until the route has the duration we specified. Formally, we perform the following steps:

Construct a single vehicle route
Input: $\eta > 0$, integer $z_u$, frequency $f_u$, $T$ = period
Step a. Start with an (arbitrary) stop $s := s_u$ and the route $u = (s)$.
Step b. Randomly take a new stop $s$, neighbored to one of the stops of route $u$. Add it to route $u$ if $\mathrm{dur}(u \cup \{s\}) \le \frac{T z_u - \eta}{f_u}$. Repeat. If the condition is not satisfied, go to Step c.
Step c. Add the slack time $\frac{T z_u}{f_u} - \mathrm{dur}(u)$ to the edge lengths of $u$ to obtain a duration of exactly $\frac{T z_u}{f_u}$.
Step d. Output $u$, $\mathrm{dur}(e)$ for all $e \in u$.
In Step b we obtain a route with maximal duration but satisfying (1), i.e. $\mathrm{dur}(u) \le \frac{z_u T - \eta}{f_u}$. It is desirable that $\eta$ is small, but not zero, such that some additional slack time is available for each route. Such time can be used to provide slack times at stops in order to enable passengers to change to other buses or, more generally, to make the timetable robust against delays. It may also be needed for breaks for the drivers at the end stops. For each route this additional time is distributed to the edge lengths in Step c. We propose to add it to stopping times at stops where transfers are likely or to the stops farthest away from the center at turnaround activities. Note that much more sophisticated approaches for distributing slack in periodic timetables exist, see Kroon et al. (2007), Liebchen and Stiller (2009).
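Steps a to c can be sketched in Python as follows. The text does not spell out how a new stop is spliced into the circular route, so this sketch makes an explicit assumption: a new stop is always inserted as an out-and-back detour from a stop already on the route, which keeps the route circular at all times. The adjacency-dictionary format and all names are likewise hypothetical, and the slack is only returned, not yet distributed to individual edges or stops.

```python
import random


def route_duration(route, graph):
    """Sum of driving times along consecutive stops of the route."""
    return sum(graph[a][b] for a, b in zip(route, route[1:]))


def construct_route(graph, start, z, freq, period, eta, rng=None):
    """Randomly grow a circular route at `start` until the duration limit is hit.

    `graph[a][b]` is the positive driving time between neighboring stops a and b
    and must contain both directions of every edge.
    """
    rng = rng or random.Random()
    limit = (period * z - eta) / freq
    route = [start]                                  # Step a
    while True:                                      # Step b
        i = rng.randrange(len(route))
        neighbours = sorted(graph[route[i]])
        if not neighbours:
            break
        new_stop = rng.choice(neighbours)
        # Assumed insertion rule: detour route[i] -> new_stop -> route[i].
        candidate = route[:i + 1] + [new_stop, route[i]] + route[i + 1:]
        if route_duration(candidate, graph) > limit:
            break                                    # condition violated: go to Step c
        route = candidate
    duration = route_duration(route, graph)
    slack = period * z / freq - duration             # Step c: slack still to be distributed
    return route, duration, slack


graph = {"C": {"A": 5, "B": 7}, "A": {"C": 5, "B": 4}, "B": {"C": 7, "A": 4}}
route, duration, slack = construct_route(graph, "C", z=1, freq=1, period=60, eta=3,
                                         rng=random.Random(1))
```

Because the loop stops at the first rejected candidate, as in Step b, the resulting route is not necessarily the longest one that fits; repeating the construction several times and keeping the best route compensates for this.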
Theoretically, we can perform the above construction completely at random. In order to obtain reasonable routes it however makes sense to further guide the procedure. To this end we suggest the following additional rules for Step b. Let U be the set of routes already found.
• Stops that have not been covered by any other route of $U$ should be more likely to be chosen to ensure that we finally obtain a set of routes covering all stops. We hence weight stops in order to increase the probability that a stop is chosen if it still does not appear in other routes. In our case study, we obtained good results by weighting the unused stops by a factor of three.
• Circles within the routes should be avoided: This can be done by taking a new stop with a higher probability if it is not already in the route. (This rule is of course not applied to the starting node $s_u$.)
• It may be desirable that routes contain most of their edges forward and backward, i.e., that they have a similar shape in inbound and outbound directions. To enforce this we suggest to consider only such routes in which the number of locations that consist of more than one stop but only have one stop in the route is small.
• In our case study we require that all vehicle routes contain a stop $v \in$ Cen of the city center. We hence choose such a stop as start stop $s_u$, which significantly reduces the search space.
• In Göttingen we also implemented the following rule: Let us call a maximal part of a route starting and ending at a stop $v \in$ Cen a branch. The public transportation company in Göttingen did not want to have routes with four or more branches. We took this into account by deleting all routes that visited the city center more than four times.
• Many other rules to model specific requirements are possible.
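The first two rules amount to a weighted random choice among the candidate stops, which can be sketched in Python as below. The factor of three for unused stops is taken from the case study; the penalty factor for stops already on the route (`revisit_factor`) and the function names are assumptions chosen only for this example.

```python
import random


def stop_weights(candidates, covered, route, unused_factor=3.0, revisit_factor=0.5):
    """Selection weights: favor stops not covered by other routes, penalize revisits."""
    weights = []
    for stop in candidates:
        weight = 1.0
        if stop not in covered:
            weight *= unused_factor    # stop does not yet appear in any other route
        if stop in route:
            weight *= revisit_factor   # stop is already on the current route
        weights.append(weight)
    return weights


def choose_stop(candidates, covered, route, rng):
    """Draw one candidate stop at random according to the weights above."""
    weights = stop_weights(candidates, covered, route)
    return rng.choices(candidates, weights=weights, k=1)[0]


weights = stop_weights(["a", "b", "c"], covered={"a", "c"}, route=["c"])
picked = choose_stop(["a", "b", "c"], {"a", "c"}, ["c"], random.Random(0))
```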
The algorithm of Phase 1 is as follows. In each step we choose (randomly) an integer z u and a frequency f u as parameters. Then we construct a set of h routes fitting to these two parameters. We evaluate these routes and choose the best. Then new parameters z u , f u are chosen and the step is repeated until all vehicles are used. Note that the correct evaluation of the attractiveness requires a timetable which is not at hand during the first phase. Hence we estimate the objective function by a rough approximation: We set the departure times at the stop s u (from which we started) to zero. As we will detail in Phase 3, the complete schedule of a route is fixed by only one of its departure times since we already distributed the slack times in Step c. We consequently obtain the following procedure for designing one candidate set of routes.

Phase 1: Design of a set of routes for N buses
Input: Parameters h ∈ Z, (small) η > 0, T = period, N = number of buses allowed.
Step 1.2: Choose a frequency f u and a (small) integer z u .

Phase 2: Designing the lines
If the vehicle routes have been fixed we can represent them as lines. A line is a path through the PTN which is operated by only one bus. Hence each part of a route can be considered as a line. As lines are usually organized as tours it is preferable to take sub-circles of the routes.
As mentioned before, the representation by lines has no effect on where and when the buses drive and hence no effect on the objective function. Consequently, we can define the lines such that we get a "nice layout".
As mentioned before, in Göttingen, all routes have to pass through the city center. Moreover, no route is allowed to contain more than three branches. We hence chose branches or combinations of pairs of branches as lines, see Fig. 5 for an illustration. These branches naturally are sub-circles of the routes.
Algorithmically, we can proceed as follows.

Phase 2: Splitting routes to lines
Input: $U$
Step 2.1: For each route $u \in U$: Decompose $u$ into circles. Choose the circles or unions of circles as lines.

Phase 3: Finding the timetable

As input for Phase 3 we have given a set of routes $U$ with their frequencies $f_u$, $u \in U$. Our goal is to construct a feasible timetable. According to our constraints, a timetable is feasible if the capacity of no stop is exceeded. Given the period $T$ we choose a timetable within the discrete set of points $\{0, 1, \dots, T\}$ (usually minutes). The timetable is then repeated periodically. The periodicity is taken into account when evaluating our objective function att. We already fixed the slack times of the edges in Step c when constructing the routes, hence fixing one departure time at one stop determines the complete schedule of the route. We use the stop $s_u$ from which we started to construct route $u$. A timetable is hence given as a vector $t \in \{0, \dots, T\}^{|U|}$. A timetable is feasible if at no point in time more than $\mathrm{cap}(v)$ buses are at stop $v$ for all $v \in V$. We denote the set of feasible timetables by $\mathcal{T}$. We call a feasible timetable $t$ optimal if
\[ \mathrm{att}(U, f, t) \ge \mathrm{att}(U, f, t') \quad \text{for all } t' \in \mathcal{T}. \qquad (5) \]
Note that the resulting model is similar to Domschke (1989), Daduna and Voss (2001) where it is solved as a quadratic semi-assignment problem. However, in our case the number of passengers using a transfer is not fixed beforehand but determined by routing the passengers in each step. We hence propose the following iterative matching approach.
Consider a route $u$ with frequency $f_u$ and departure time $t_u$ at stop $s_u$. Then further departures of the same route take place at $t_u + z\,T/f_u$ for all integer values of $z$. Hence we only need to evaluate departure times $t_u \in \{0, 1, \dots, T/f_u\}$. But even with this reduction it is not possible to try all combinations of departure times. Since finding an optimal timetable in Phase 3 is NP-hard, we propose to use a heuristic. Our first idea, namely to fix the departure times of the routes one after another, had the following drawback: We obtained routes that all departed at the same time from the same central stop. Once the capacity of this stop was used up, the remaining routes were placed very disadvantageously, leading to an unfavorable overall solution.
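The reduction of the candidate departure times, and the reason why full enumeration is still out of reach, can be illustrated with a small sketch. The function names are hypothetical and $T$ is assumed to be divisible by every frequency. The candidate list stops at $T/f_u - 1$ because the endpoint $T/f_u$ describes the same periodic schedule as 0.

```python
from math import prod


def departures(t_u, f_u, T):
    """All departures of a route within one period: t_u + z * T / f_u (mod T)."""
    headway = T // f_u                      # assumes T is divisible by f_u
    return sorted((t_u + z * headway) % T for z in range(f_u))


def candidate_times(f_u, T):
    """Departure times that lead to distinct periodic schedules of one route."""
    return list(range(T // f_u))


def search_space_size(frequencies, T):
    """Number of timetables a full enumeration would have to evaluate."""
    return prod(T // f for f in frequencies)


example_departures = departures(50, 2, 60)       # second departure wraps around
example_size = search_space_size([1, 2], 60)     # 60 * 30 combinations
```

Already for ten routes with frequency 1 and $T = 60$ the enumeration would cover $60^{10}$ timetables, which motivates the heuristic.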
We hence developed the following approach. We divide the routes into pairs and synchronize each pair in a first step. In a second step we combine the pairs to quadruples and again synchronize them. We proceed in this manner until all routes are fixed. During this process we choose the pairs in each step by matching techniques to ensure that the most promising (feasible) combinations are grouped.
Formally, we define the graph $G_{match} = (U, E_{match})$ in which the nodes represent the routes $U$, and we add an edge between two routes $u_1, u_2$ if they contain at least one stop where a transfer is possible. To this end we have to check if $u_1 \cap u_2 \neq \emptyset$ and if the capacity of the stops in $u_1 \cap u_2$ is large enough to allow both buses to stop there. As weight for edge $\{u_1, u_2\}$ we set

$$c_{u_1,u_2} := \max_{t_1, t_2 \in \mathcal{T}} \mathrm{att}(\{u_1, u_2\}, \{f_{u_1}, f_{u_2}\}, (t_1, t_2)),$$

i.e., we choose the best possible synchronization of the two routes (independent of all other routes). Since one of the two times $t_1, t_2$ can be fixed arbitrarily, we only have to evaluate

$$c_{u_1,u_2} := \max_{t \in \mathcal{T}} \mathrm{att}(\{u_1, u_2\}, \{f_{u_1}, f_{u_2}\}, (0, t)). \qquad (6)$$

We then choose a matching with maximum weight in the graph $G_{match}$ and synchronize the matched pairs of routes. Each of the pairs (or of the single routes, if the matching was not a perfect matching) is then clustered to one new node for the matching graph of the next step. Edges between the clustered nodes are introduced when a transfer between the two groups of routes is possible. We again choose the weight as the best possible attractiveness which we can obtain when synchronizing the two groups. (Note that here, too, only schedules that respect the capacities of the stops are taken into account.) We again determine an optimal matching in the clustered graph and repeat the process until only one group is left.
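The matching step can be sketched as follows. The text does not say which matching algorithm was used, so the exhaustive recursion below is only an illustrative stand-in that is exact but practical for a handful of nodes; the edge weights in the example are made up and do not come from the case study.

```python
from functools import lru_cache


def max_weight_matching(nodes, weights):
    """Exact maximum-weight matching by exhaustive recursion (small graphs only).

    nodes:   iterable of hashable node ids (routes or groups of routes)
    weights: dict frozenset({a, b}) -> weight, one entry per edge of G_match
    Returns (total weight, tuple of matched pairs).
    """
    @lru_cache(maxsize=None)
    def best(remaining):
        if len(remaining) < 2:
            return 0.0, ()
        first, rest = remaining[0], remaining[1:]
        # Option 1: leave `first` unmatched (the matching need not be perfect).
        best_val, best_pairs = best(rest)
        # Option 2: match `first` with one of its neighbours.
        for i, other in enumerate(rest):
            w = weights.get(frozenset((first, other)))
            if w is None:                   # no transfer possible, hence no edge
                continue
            val, pairs = best(rest[:i] + rest[i + 1:])
            if w + val > best_val:
                best_val, best_pairs = w + val, ((first, other),) + pairs
        return best_val, best_pairs

    return best(tuple(nodes))


# Made-up weights: picking the heaviest edge first (u2-u3) would be suboptimal.
weights = {
    frozenset(("u1", "u2")): 5,
    frozenset(("u2", "u3")): 8,
    frozenset(("u3", "u4")): 5,
    frozenset(("u1", "u4")): 1,
}
total, pairs = max_weight_matching(["u1", "u2", "u3", "u4"], weights)
```

In the example the matching {u1, u2}, {u3, u4} with total weight 10 beats the greedy choice {u2, u3}, {u1, u4} with total weight 9, which is why a maximum-weight matching rather than a greedy pairing is used.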
To state the algorithm we need to deal with groups of routes $g \subset U$. Synchronizing such a group of routes means to find a timetable $t_g := (t_u : u \in g)$ for all routes $u \in g$. Note that such a timetable can be shifted in time without changing its objective value, i.e.,

$$\mathrm{att}(g, f, t_g) = \mathrm{att}(g, f, t_g + t),$$

where $f = (f_u : u \in g)$ and $t_g + t = (t_u + t : u \in g)$. We can hence assume without loss of generality that there is one representative route $u_g$ in each group $g$ with $t_{u_g} = 0$.
Consider two groups of routes $g_1$ and $g_2$ with timetables $t_{g_1}$ and $t_{g_2}$. If we want to synchronize these groups (without changing their internal timetables) we have to find

$$\max_{t \in \mathcal{T}} \mathrm{att}(g_1 \cup g_2, (f_u, u \in g_1 \cup g_2), (t_{g_1}, t + t_{g_2})).$$

The optimal value for $t$ is denoted by $t^*_{g_1,g_2}$ and called the synchronization shift. Our algorithm starts with a first partition into groups, each group consisting of only one route. In each step, the groups are matched pairwise. (Some groups may be left unmatched if the matching is not perfect.) The procedure can be summarized as follows.
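The search for the synchronization shift can be sketched by trying every shift of the second group within one period. The objective att itself is not reproduced here; `toy_att` is a loudly hypothetical stand-in that only penalizes transfer waiting times at a single shared stop, and the sketch ignores stop capacities. Each group is represented, again as an assumption, by a dictionary mapping a route to the minute at which it serves the shared stop and to its frequency.

```python
def wait(arrival, departure, f_dep, T):
    """Minutes until the next departure of a route that runs f_dep times per period."""
    return (departure - arrival) % (T // f_dep)   # assumes T divisible by f_dep


def toy_att(events, T):
    """Hypothetical stand-in for att: minus the total transfer waiting time.

    events: route -> (minute at the shared stop, frequency)
    """
    total = 0
    for a, (t_a, _) in events.items():
        for b, (t_b, f_b) in events.items():
            if a != b:
                total += wait(t_a, t_b, f_b, T)
    return -total


def synchronization_shift(g1, g2, T, att=toy_att):
    """Shift t of group g2 against g1 that maximizes att; internal timetables stay fixed."""
    best_t, best_val = 0, None
    for t in range(T):
        shifted = {u: ((m + t) % T, f) for u, (m, f) in g2.items()}
        val = att({**g1, **shifted}, T)
        if best_val is None or val > best_val:    # ties keep the smallest shift
            best_t, best_val = t, val
    return best_t, best_val


shift, value = synchronization_shift({"u1": (10, 1)}, {"u2": (25, 1)}, 60)
```

With this toy objective the best shift lets both routes meet at the shared stop in the same minute, which mirrors the tendency described earlier of routes gathering at a central stop. The stand-in is also invariant under a common shift of all routes, as required of att.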

Input: $U$, $f_u$ for all $u \in U$.
Step 3.4: Find a matching $E_m \subseteq E_{match}$ maximizing the sum of weights.
Step 3.5: Update groups: For each $e = \{g_1, g_2\} \in E_m$ define $g := g_1 \cup g_2$ and
• $V_{match} = V_{match} \cup \{g\} \setminus \{g_1, g_2\}$
• $u_g = u_{g_1}$ as representative route of group $g$
• $t_g = (t_{g_1}, t^*_{g_1,g_2} + t_{g_2})$ as timetable of group $g$, using the synchronization shift calculated before.
Goto Step 3.2
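The group update of Step 3.5 can be sketched as a small pure function. The representation is an assumption made for illustration: groups are frozensets of route names, `rep` maps a group to its representative route, and `tt` maps a group to its internal timetable. Reducing the shifted times modulo $T$ is likewise an assumption that relies on the timetable being periodic.

```python
def merge_groups(v_match, rep, tt, g1, g2, shift, T):
    """Step 3.5 for one matched edge {g1, g2}; returns updated copies.

    v_match: set of groups (frozensets of route names)   [hypothetical layout]
    rep:     group -> representative route u_g
    tt:      group -> {route: departure time}, with time 0 for rep[group]
    shift:   synchronization shift t*_{g1,g2} computed beforehand
    """
    g = g1 | g2
    new_v = (v_match | {g}) - {g1, g2}            # V_match u {g} \ {g1, g2}
    new_rep = {**rep, g: rep[g1]}                 # u_g = u_{g1}
    merged = dict(tt[g1])                         # t_{g1} is kept unchanged
    merged.update({u: (t + shift) % T for u, t in tt[g2].items()})
    new_tt = {**tt, g: merged}                    # t_g = (t_{g1}, shift + t_{g2})
    return new_v, new_rep, new_tt


g1, g2 = frozenset({"u1"}), frozenset({"u2"})
v_match, rep, tt = merge_groups(
    {g1, g2},
    {g1: "u1", g2: "u2"},
    {g1: {"u1": 0}, g2: {"u2": 0}},
    g1, g2, shift=45, T=60,
)
```

After the call, the matching graph contains the single group {u1, u2} with representative u1, whose departure time stays at 0, while u2 is shifted by 45 minutes.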

Numerical results
We implemented our procedures and tested them within a case study in Göttingen using a PC Intel Centrino (1.4 GHz). We had two different data sets: The night bus system (from 7:30 pm to midnight) and the system at daytime (from 5:30 am to 7:30 pm). Both used the same PTN and the same number of cells and locations. The period T equals 60 minutes in both cases.
When constructing the vehicle routes we chose $z_u \in \{1, 2, 3\}$ and $f_u \in \{1, 2\}$. Hence (1) typically leads to routes with a duration of 60, 90, 120, or 180 minutes; $\eta$ has been fixed to 10% of $T/f_u$. In each iteration of Phase 1 we generated $h = 1000$ candidate routes in Step 1.3 using different heuristics and different search strategies. In Step 1.4 the best of these routes was chosen. We repeated the process until the given number $N$ of buses was attained, where $N = 23$ for the system at night and $N = 46$ for the system at daytime.
In the timetabling step, the first iteration of our matching approach resulted in pairs of routes, almost all of them being synchronized at one of the central stations. In the next steps, groups were still synchronized at central stations as long as the capacity of these stations was not exceeded. Otherwise, the locally best alternatives of synchronizing at other stations or other times were identified.
Note that we pursued two different goals for the system at night and the system at daytime. At night we were given $N = 23$ buses and tried to construct an attractive night bus system for Göttingen. At daytime we did not aim to maximize the attractiveness; instead, the goal was to reduce the costs. This can be done by decreasing the number of available buses step by step until the attractiveness of the new system becomes too small. In Göttingen we were able to reduce the number of buses by 10% while the attractiveness was still 1% higher than before. The two solutions which are best according to the practitioners of GÖVB are listed in Table 1. They consist of 10 lines for the night system and of 15 lines for the system at daytime.
The nighttime solution improved the attractiveness of the current solution by 18.7%. We now analyze this new plan in more detail: The new timetable does not have the long breaks at the ends of the lines. The saved additional buses are used to increase the frequencies of the routes. Moreover, the new solution is more robust due to the distribution of the slack times, and it takes the capacities of the stops into account. To illustrate the differences we zoomed into the northeastern part of Göttingen where some typical differences between the current system and our solution can be seen. The left picture in Fig. 6 shows the potential edges E. In the middle, the current night bus system is shown, and in the right part our new solution. One can see that major parts of the lines remained as they were. One line was removed, but the suburb it connects was added to another line. The reason is that the intermediate stops (which are now not covered) have nearly no demand, such that the bus could be used more efficiently in another part of the city. Moreover, the two lines crossing each other have been changed and now follow their shortest paths more closely.

Table 2
                  Lines with frequency 1    Lines with frequency 2
Current system    7                         5
New system        3                         7
On a more global view it turns out that none of the new lines stayed exactly as they were in the old solution, but the major difference comes from a new combination of the different branches in the city center. If we neglect this and look at the single branches between the city center and the outskirts, we find nine branches which are almost identical to the old plan, three branches with minor changes and seven branches with major changes. Also the number of lines with a frequency of two increased while the number of lines with a frequency of one decreased (Table 2).

Conclusion
In this paper we presented a new approach to tackle three well-known problems in public transportation, namely line planning, timetabling and vehicle scheduling. We did not use the classical sequence of the planning phases but started by constructing the vehicle routes. The advantage is that costs can be controlled during the whole process while a customer-oriented objective can be followed. A drawback might be that our reordering does not straightforwardly allow vehicle and crew scheduling to be integrated, as can be done in the traditional approach. This is a topic for further research.
Two phases of our approach, namely, constructing the routes and fixing the timetable are NP-hard. We hence suggest heuristic solutions. Our decomposition yields new problems that are currently under investigation from a theoretical point of view. We are sure that improvements and further results about optimality gaps and quality of heuristics can be made in both phases.
From an experimental point of view we are currently developing a library LinTim (Schachtebeck and Schöbel 2009) which is able to perform the different planning steps in public transportation on the same example scenarios consecutively. Its goal is to evaluate lines, timetables, and vehicle schedules in an integrated way. In our future work we plan to use LinTim to compare our new approach with the traditional sequence of algorithms. Moreover, we are analyzing integration of the different planning steps in star-shaped networks with a simple passengers' structure.
Finally, from a practical point of view, our approach proved to be successful. Some components of our solution are already implemented in Göttingen and seem to perform well; a complete new system is under research.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.