What Petri Net Obliges Us to Say: Comparing Approaches for Behavior Composition

We identify and demonstrate a weakness of Petri Nets (PN) in specifying composite behavior of reactive systems. Specifically, we show how, when specifying multiple requirements in one PN model, modelers are obliged to specify mechanisms for combining these requirements. This yields, in many cases, over-specification and incorrect models. We demonstrate how some execution paths are missed, and some are generated unintentionally. To support this claim, we analyze PN models from the literature, identify the combination mechanisms, and demonstrate their effect on the correctness of the model. To address this problem, we propose to model the system behavior using behavioral programming (BP), a software development and modeling paradigm designed for seamless integration of independent requirements. Specifically, we demonstrate how the semantics of BP, which define how to interweave scenarios into a single model, allow avoiding the over-specification. Additionally, while BP maintains the same mathematical properties as PN, it provides means for changing the model dynamically, thus increasing the agility of the specification. We compare BP and PN in quantitative and qualitative measures by analyzing the models, their generated execution paths, and the specification process. Finally, while BP is supported by tools that allow for applying formal methods and reasoning techniques to the model, it lacks the legacy of PN tools and algorithms. To address this issue, we propose semantics and a tool for translating BP models to PN and vice versa.


INTRODUCTION
The linguistic relativity hypothesis says that the languages we speak influence our worldview or cognition. While early linguists believed that language determines thought, it is now commonly accepted that language influences only certain cognitive processes in non-trivial ways [1]. Deutscher [2], for example, says, "when your language routinely obliges you to specify certain types of information, it forces you to be attentive to certain details in the world and to certain aspects of experience that speakers of other languages may not be required to think about all the time." The linguistic-relativity hypothesis has been a guiding principle for computer languages, from early to modern ones, that were designed to direct programmers to change and adapt their thinking to the way machines "think". There are many examples: Iverson argued that notations aid in thinking about computer algorithms [3], Matz says that he was inspired by this hypothesis when creating the Ruby language [4], and many more [5], [6]. While linguistics researchers have moved to "softer" versions of the hypothesis, in software engineering, to the best of our knowledge, computer-language researchers are still guided by its early version, with one exception, as discussed below.
In this work, we follow Deutscher and demonstrate how the Petri net (PN) language routinely obliges users to specify things that they do not wish to specify, resulting in unexpected complications and even incorrect specifications. Specifically, we show how the attempt to specify multiple requirements in one PN model forces users to also specify mechanisms for combining these requirements, resulting in over-specification and possibly incorrect models, as some execution paths are missed and some are generated unintentionally.
• All authors are with Ben-Gurion University of the Negev, Israel. E-mail: {achiya,geraw}@bgu.ac.il, tomya@post.bgu.ac.il
Petri net (PN) is a modeling language with formal semantics that allow for both executing the model and analyzing it. The formal semantics differentiate PN from other process and behavioral modeling languages, such as activity and sequence diagrams. The ability to synthesize the model into working software and analyze it makes it commonly used for modeling and programming discrete event systems (DES), i.e., dynamic systems with a discrete, potentially infinite, state space. A comprehensive introduction to PN can be found in [7].
In Section 9 we survey behavior-composition approaches for PN that have been proposed over the years. Nevertheless, these approaches require modelers to consider all the mutual dependencies directly. As we demonstrate in this paper, this may not be feasible in some cases.
To support our claim on PN, we begin with an analysis of a known DES benchmark, called level crossing, and demonstrate the inaccuracy of several PN models for this benchmark. To address this problem, we propose a different modeling and programming paradigm, called behavioral programming (BP), that allows for a direct specification, execution, and verification of requirements. Like PN, BP is supported by tools for applying formal methods and reasoning techniques to the model. Nevertheless, to keep the legacy of PN tools and algorithms, we propose tools for translating BP models to PN and vice versa. As we will show, our approach has the following advantages:
• It supports a modular specification approach where each module isolates a specific aspect of the system behavior.
• It allows modelers to specify the behavior only, exempting them from specifying mechanisms for combining the behaviors.
• It allows for applying formal methods and reasoning techniques for analyzing and verifying different properties of the system behavior, such as reachability, liveness, boundedness, etc.
• These algorithms can be executed in a compositional way, thus handling large-scale programs.
• We present translational semantics from the BP model to the PN model and vice versa.
• This translation supports current PN-based practices and algorithms.
Furthermore, to avoid the necessity of verifying properties of both the BP model and the PN model, we propose an algorithm and a tool for comparing the two models and testing their equivalency. Thus, our approach allows PN modelers to verify the requirements' correctness and the alignment between the requirements and their implementation.
PN has many extensions and variations. In this paper, we compare BP to the basic, and arguably the most familiar, PN formalism. Nevertheless, we discuss some of these extensions in Section 9.
The rest of the paper continues as follows. Section 2 gives a short primer on behavioral programming, followed by a general description of the level-crossing benchmark in Section 3. We model the benchmark requirements with BP in Section 4 and with PN in Section 5. In Section 6, we provide an algorithm for comparing the two models and use it for demonstrating how the mechanism specification in PN results in incorrect behavior. In Section 7, we provide translational semantics between the two models. In Section 8, we complete our analysis with more PN models and quantitative comparison between BP and PN. We conclude the paper with a survey of related work (Section 9) and a short discussion (Section 10).

BEHAVIORAL PROGRAMMING
The behavioral programming paradigm focuses on constructing reactive systems incrementally from their expected behaviors [8], [9]. When creating a system using BP, developers specify a set of scenarios that may, must, or must not happen. Each scenario is a simple sequential thread of execution and is thus called a b-thread. B-threads are typically aligned with system requirements, such as "train may not enter when barriers are up". The set of b-threads in a model is called a behavioral program (b-program). During runtime, an application-agnostic execution engine interweaves all b-threads participating in a b-program, yielding a complex behavior consistent with all said b-threads. As we will show, this execution engine exempts the modelers from specifying how the requirements interact.
BP is interesting from the linguistic-relativity perspective. Instead of directing its users to a particular way of thinking, the main design goal of the paradigm is precisely the opposite. It aims to enable modelers to specify reactive systems' behavior in a natural and intuitive manner that is aligned with how they perceive the system requirements. To address this goal, several extensions to the paradigm have been proposed to improve this alignment and remove the necessity of specifying mechanisms [9], [10], [11]. Also, user studies measured the naturalness and intuitiveness of the paradigm, compared to other paradigms [12], [13].
Harel, Marron, and Weiss [23] proposed a simple protocol for b-thread synchronization, as follows. Before each event that the b-program produces is selected, every b-thread submits a statement. The statement declares which events the b-thread requests, which events it waits for (but does not request), and which events it blocks (forbids from happening). After submitting the statement, the b-thread pauses. When all b-threads have submitted their statements, we say that the b-program has reached a synchronization point. Then, a central event arbiter selects a single event that was requested and was not blocked. Given this event, the arbiter resumes all b-threads that requested or waited for that event. The rest of the b-threads remain paused, and their current statements are used in the next synchronization point.
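The synchronization protocol can be illustrated with a minimal sketch in plain JavaScript. It is not the BPjs engine or its API: b-threads are modeled here as generators that yield their statements, and the names `run`, `doThreeTimes`, and `choose` are hypothetical helpers introduced only for this illustration.

```javascript
// A b-thread is a generator; each yielded object is its statement
// at a synchronization point: { request, waitFor, block }.
function* doThreeTimes(event) {            // hypothetical b-thread
  for (let i = 0; i < 3; i++) yield { request: [event] };
}

// Hypothetical arbiter: `choose` picks one of the enabled events.
function run(bthreads, choose = (enabled) => enabled[0]) {
  const trace = [];
  // Advance every b-thread to its first synchronization point.
  let live = bthreads
    .map((bt) => ({ bt, stmt: bt.next() }))
    .filter((x) => !x.stmt.done);
  for (;;) {
    const stmts = live.map((x) => x.stmt.value);
    const blocked = new Set(stmts.flatMap((s) => s.block || []));
    const requested = [...new Set(stmts.flatMap((s) => s.request || []))];
    // Selectable events: requested by some b-thread, blocked by none.
    const enabled = requested.filter((e) => !blocked.has(e));
    if (enabled.length === 0) return trace;
    const event = choose(enabled);
    trace.push(event);
    // Resume only the b-threads that requested or waited for the event.
    for (const x of live) {
      const s = x.stmt.value;
      if ((s.request || []).includes(event) || (s.waitFor || []).includes(event)) {
        x.stmt = x.bt.next(event);
      }
    }
    live = live.filter((x) => !x.stmt.done);
  }
}

const trace = run([doThreeTimes("A"), doThreeTimes("B")]);
console.log(trace.join("")); // AAABBB with the default "first enabled" policy
```

The event-selection policy is deliberately a parameter: the protocol only requires that the selected event be requested and not blocked, so a different `choose` function yields a different, equally valid run.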
To make these concepts more concrete, we now turn to a tutorial example of a simple b-program. The example presented in this section is an adaptation of one of the first demonstration programs presented in [23] (the hot/cold example). For convenience and succinctness of the specification, the b-programs in this paper are written using BPjs, an environment for running behavioral programs written in JavaScript [17]. While the b-program specification may be considered programming rather than modeling, the same program can be specified using diagrammatic implementations of the BP paradigm, including live-sequence charts [24] and Blockly [10]. Moreover, b-programs can be translated to PN models and vice versa, as described in Section 7.
The example: Consider a system with the following requirements:
1) When the system loads, do 'A' three times.
2) When the system loads, do 'B' three times.
Listing 1 shows a b-program (a set of b-threads) that fulfills these requirements. It consists of two b-threads, added at the program start-up. One b-thread, namely Do-A, is responsible for fulfilling requirement #1, and the second b-thread, namely Do-B, fulfills requirement #2.
The program's structure is aligned with the system requirements. It has a single b-thread for each requirement, and it does not dictate the order in which actions are performed (e.g., the following runs are possible: AABBAB, or ABABAB, etc.). This is in contrast to, say, a single-threaded JavaScript program that must dictate exactly when each action should be performed. Thus, traditional programming paradigms are prone to over-specification, while behavioral programming avoids it.
While a specific order of actions was not originally required, this behavior may represent a problem in some cases. Consider, for example, an additional requirement that the user detected after running the initial version of the system: 3) Two actions of the same type cannot be executed consecutively.
While we may add a condition before requesting 'A' and 'B', the BP paradigm encourages us to add a new b-thread for each new requirement. Thus we add a b-thread, called Interleave, presented in Listing 2.
Listing 2: A b-thread that ensures that two actions of the same type cannot be executed consecutively, by blocking an additional request of 'A' until 'B' is performed, and vice versa.
The Interleave b-thread ensures that there are no repetitions. It does so by forcing an interleaved execution of the performed actions: 'A' is blocked until 'B' is executed, and then 'B' is blocked until 'A' is executed. This is done by using the waitFor and block idioms. Note that this b-thread can be added and removed without affecting other b-threads. This is an example of a purely additive change, where the system behavior is altered to match a new requirement without affecting the existing behaviors.
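The effect of such an additive change can be checked on a small scale by enumerating every maximal run with and without the extra b-thread. The sketch below is plain JavaScript rather than BPjs; the generator shapes of the Do-A, Do-B, and Interleave b-threads, and the helpers `enabledAfter` and `allTraces`, are assumptions made for illustration and follow the prose description above.

```javascript
// Assumed shapes of the b-threads, written as generators.
function* doThree(e) { for (let i = 0; i < 3; i++) yield { request: [e] }; }
function* interleave() {
  for (;;) {
    yield { waitFor: ["B"], block: ["A"] }; // 'A' is blocked until 'B'
    yield { waitFor: ["A"], block: ["B"] }; // then 'B' is blocked until 'A'
  }
}

// Generators cannot be cloned, so each prefix is replayed on fresh b-threads.
// Returns the events that are requested and not blocked after `prefix`.
function enabledAfter(makeThreads, prefix) {
  let live = makeThreads()
    .map((bt) => ({ bt, s: bt.next() }))
    .filter((x) => !x.s.done);
  for (const e of prefix) {
    for (const x of live) {
      const { request = [], waitFor = [] } = x.s.value;
      if (request.includes(e) || waitFor.includes(e)) x.s = x.bt.next();
    }
    live = live.filter((x) => !x.s.done);
  }
  const blocked = new Set(live.flatMap((x) => x.s.value.block || []));
  const requested = new Set(live.flatMap((x) => x.s.value.request || []));
  return [...requested].filter((e) => !blocked.has(e));
}

// Depth-first enumeration of all maximal traces (finite programs only).
function allTraces(makeThreads, prefix = []) {
  const enabled = enabledAfter(makeThreads, prefix);
  if (enabled.length === 0) return [prefix.join("")];
  return enabled.flatMap((e) => allTraces(makeThreads, [...prefix, e]));
}

const base = () => [doThree("A"), doThree("B")];
const withInterleave = () => [...base(), interleave()];

console.log(allTraces(base).length);    // 20 interleavings of AAA and BBB
console.log(allTraces(withInterleave)); // [ 'BABABA' ]
```

Note that `withInterleave` reuses `base` unchanged: the new requirement only removes runs, and dropping the extra b-thread restores the original 20 runs.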
Recall the discussion in the introduction regarding Deutscher's concept of languages that oblige people to specify things that they do not wish to specify. A critical reader may suspect that BP obliges users to specify unnecessary information for guiding the execution protocol. To answer this, we note that the BP protocol for composing behaviors is implicitly defined and is not part of the model. This protocol is aligned with an implicit protocol that already exists in requirement documents. Each requirement specifies a single aspect of the behavior, and it does not concern itself with other behaviors, though it is clear that the requirements are related to each other. Thus, the implicit protocol of BP does not force the modeler to specify unnecessary information; it only uses assumptions that already exist in the requirements.

THE LEVEL-CROSSING BENCHMARK
We now turn to describe the level-crossing benchmark that we will use throughout the following sections.
The level-crossing (LC) domain was first presented in 1987 by Leveson and Stolzy [25] and modeled with Petri nets (PN). It was later used in various research areas of PN modeling and software safety analysis [26], [27], [28]. Although some of these works modified the original model to add features pertinent to the relevant study, they all followed the same general behavior of [25].
Leveson and Stolzy [25] defined the model as a controller for a gate at a railway crossing, i.e., an intersection between a railway line and a road at the same level. The railway line has a sensor that signals the controller whenever the train is approaching, entering, or leaving the crossing zone. Based on the signals, the barriers are raised and lowered, ensuring the safety of the trains, i.e., that a train cannot be in the crossing zone while the barriers are up.
While the system behavior is not explicitly specified as a set of requirements, we have extracted the following requirements as we understand them, and we will later refine them:
1. The railway sensor system dictates the exact event order: train approaching, entering, and then leaving. Also, there is no overlapping between successive train passages.
2. The barriers are lowered when a train is approaching and then raised as soon as possible.
3. A train may not enter while barriers are up.
4. The barriers may not be raised while a train is in the intersection zone.
The intersection zone is the area between the approaching sensor and the leaving sensor. At system initialization, there is no train at the intersection zone, and the barriers are raised.
We note that these requirements specify the behavior that the controller should enforce, though they do not specify the implementation details. As we will show, our BP implementation will follow this distinction and keep the alignment between the requirements and the model. However, the PN model will add a mechanism for combining the behaviors that will cause incorrect behavior.

MODELING THE REQUIREMENTS WITH BP
To emphasize the agility of BP models, we begin with a specification that handles only one railway, and we will later extend this model to support multiple railways and faults.
Following the principles of BP described in Section 2, each b-thread in Listing 3 is aligned to a single requirement of the system.
The first requirement, describing the order of the sensor's events, is specified in the first b-thread. It continuously requests to "approach", "enter", and "leave", dictating this specific order. We note that a new cycle can start only if the previous train has left the intersection zone, which is aligned with the requirement of no overlapping between successive train passages.
The second b-thread specifies the second requirement of the barriers' behavior. It waits for a train to approach and then requests to lower the barriers. When the barriers are down, it requests to raise them as soon as possible. We note that the two barrier events, Lower and Raise, can only happen consecutively, aligned with the system behavior description. Requirement 3 is specified by the third b-thread, which blocks the train from entering while the barriers are up. The first synchronization point blocks the train from entering before lowering the barriers. The second synchronization point ensures that if the barriers are raised between the approaching and the entering events, then the behavior returns to its initial state.
Listing 3: A BP program that specifies the requirements for a single railway. Each b-thread is aligned with a single requirement. An application-agnostic execution engine interweaves these b-threads at runtime, yielding a complex behavior that is consistent with each b-thread, liberating the designer from explicitly specifying the joint model.
Finally, the last b-thread specifies Requirement 4, blocking the raising of the barriers while there is a train in the intersection zone.
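One possible reading of the four b-threads described above is sketched below in plain JavaScript, with generators standing in for BPjs b-threads. The event names, generator bodies, and the helpers `enabledAfter`, `safe`, and `explore` are our assumptions based on the prose, not the paper's listing. The sketch explores all runs up to a bounded depth and checks Requirements 3 and 4 on each of them.

```javascript
const [APPROACH, ENTER, LEAVE, LOWER, RAISE] =
  ["Approaching", "Entering", "Leaving", "Lower", "Raise"];

// Requirement 1: the sensor events occur in a fixed cyclic order.
function* railway() {
  for (;;) { yield { request: [APPROACH] }; yield { request: [ENTER] }; yield { request: [LEAVE] }; }
}
// Requirement 2: lower after an approach, then raise as soon as possible.
function* barriers() {
  for (;;) { yield { waitFor: [APPROACH] }; yield { request: [LOWER] }; yield { request: [RAISE] }; }
}
// Requirement 3: no entering while the barriers are up.
function* noEntryWhileUp() {
  for (;;) { yield { waitFor: [LOWER], block: [ENTER] }; yield { waitFor: [RAISE] }; }
}
// Requirement 4: no raising while a train is in the intersection zone.
function* noRaiseWhileInZone() {
  for (;;) { yield { waitFor: [APPROACH] }; yield { waitFor: [LEAVE], block: [RAISE] }; }
}
const makeThreads = () => [railway(), barriers(), noEntryWhileUp(), noRaiseWhileInZone()];

// Replays `prefix` on fresh b-threads; returns requested-and-not-blocked events.
function enabledAfter(prefix) {
  const live = makeThreads().map((bt) => ({ bt, s: bt.next().value }));
  for (const e of prefix) {
    for (const x of live) {
      if ((x.s.request || []).includes(e) || (x.s.waitFor || []).includes(e)) x.s = x.bt.next().value;
    }
  }
  const blocked = new Set(live.flatMap((x) => x.s.block || []));
  const requested = new Set(live.flatMap((x) => x.s.request || []));
  return [...requested].filter((e) => !blocked.has(e));
}

// Independent monitor for Requirements 3 and 4.
function safe(trace) {
  let down = false, inZone = false;
  for (const e of trace) {
    if (e === ENTER && !down) return false;  // entered while barriers up
    if (e === RAISE && inZone) return false; // raised while train in zone
    if (e === LOWER) down = true;
    if (e === RAISE) down = false;
    if (e === APPROACH) inZone = true;
    if (e === LEAVE) inZone = false;
  }
  return true;
}

// Bounded exhaustive exploration of all runs of length <= depth.
function explore(prefix, depth) {
  if (!safe(prefix)) return false;
  if (depth === 0) return true;
  return enabledAfter(prefix).every((e) => explore([...prefix, e], depth - 1));
}

console.log(explore([], 12)); // true: no violation within 12 events
```

A bounded exploration like this is only a sanity check; it does not replace the model-checking tools mentioned in the paper.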
This model demonstrates some merits of the BP modeling approach. The system was modeled in an incremental and modular manner, where each module is aligned with a single requirement and is unaware of other b-threads. We claim that the resulting modules are readable and comprehensible to all stakeholders.

MODELING THE SYSTEM WITH PN
In this section and in Section 6, we present three PN models for the LC domain. We begin with the original model from 1987 of Leveson and Stolzy [25] and continue with the two models of Ghazel and Liu [27] from 2016. As we demonstrate below, all of these models are incorrect, as some execution paths are missed and some are generated unintentionally.
The original model of [25] is composed of three types of subsystems: railway traffic, barriers, and a barriers' controller. To comply with the specified behavior of the entire system, the modelers specified a mechanism to combine these subsystems. As we discuss below, this mechanism changes the behavior of the model, causing unpredictable side effects.
The railway-traffic subsystem (depicted in Figure 1) specifies the dynamics of the railway using three places and three transitions, corresponding to the sensor's events: approaching, entering, and leaving. The index of these events denotes the railway index, though for now, we have only one.
The barriers subsystem (depicted in Figure 2) has two states, up and down (marked by $p_7$ and $p_8$, respectively). The barriers passively respond to the commands issued by their controller, which we now describe.
The barriers' controller subsystem (depicted in Figure 3) provides an interface between the railway traffic and the barriers subsystems. A closing request is fired when a train approaches, and when a train leaves, an opening request is fired. Note that this subsystem contains two interlocks, $p_2$ and $p_3$, which together make sure that closing request and opening request fire alternately. In practice, this means that the barriers may be closed if and only if they are open.
Finally, to address Requirement 3 and forbid the train entrance while the barriers are up, the unified model that integrates the three subsystems (depicted in Figure 4) includes an additional interlocking state, $p_9$, and its arcs lower → $p_9$ and $p_9$ → entering.
We note that the controller events closing request and opening request are not mentioned in the requirements. The reason is that Leveson and Stolzy [25] designed the PN model as a controller where these events act as part of the implementation of the controller. We argue, as we show below, that it is better to model the requirements separately from the implementation. We mark these implementation events as helper events, since they are not required for specifying the system behavior, only for the specific implementation perspective. We show below that the helper events and the interlocking mechanism lead to undesired system behaviors.
Figure 4: The unified PN LC model of [25], including the three subsystems and the interlocking mechanism.

COMPARING THE BP AND PN
In this section, we compare the BP and the PN models to verify our model's correctness and explicate the differences between the models.

Models equivalency
We begin with a definition of equivalence. Generally, two models are equivalent if they yield the same set of runs, i.e., the same sequences of events. However, there is a complication in our case since the PN model requires helper events that are not part of the BP model. For example, the two traces in Table 1 are equivalent in system behavior, though they have different events. As the example shows, and as noted before, these events are used to synchronize the barriers and the railway events. They are not required for comparing the resulting behavior of the two models. Thus, our equivalency definition ignores these events. For completeness, we allow helper events on either side of the comparison.
Definition 1. Models $M_1$ and $M_2$ over the event sets $E_1$ and $E_2$, respectively, are equivalent if and only if
$$\{\pi_{E_1 \cap E_2}(t) : t \in L(M_1)\} = \{\pi_{E_1 \cap E_2}(t) : t \in L(M_2)\},$$
where $L(M_i)$ is a set of sequences of events, called traces, that model $M_i$ generates, and $\pi_{E_1 \cap E_2}(t)$ is an operator that removes from a trace $t$ all the events that are not in $E_1 \cap E_2$:
$$\pi_E(e \cdot t) = \begin{cases} e \cdot \pi_E(t) & \text{if } e \in E, \\ \pi_E(t) & \text{otherwise.} \end{cases}$$
To allow traces with finite length, we also define that $\pi_E(\varepsilon) = \varepsilon$ for any $E$. The sequences in the sets $L(M_i)$ can have a finite or infinite length.
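For finite sets of finite traces, Definition 1 can be sketched directly in JavaScript. The helper names `project` and `equivalent`, the event names, and the two sample traces are hypothetical and serve only to illustrate the projection; the infinite-trace case treated in the paper requires automata and is not covered by this sketch.

```javascript
// pi_E(t): remove from trace t every event that is not in E.
const project = (trace, E) => trace.filter((e) => E.has(e));

// Finite approximation of Definition 1: compare the projected trace sets.
// Traces are serialized with "," so event names must not contain commas.
function equivalent(L1, E1, L2, E2) {
  const common = new Set([...E1].filter((e) => E2.has(e)));
  const image = (L) => new Set(L.map((t) => project(t, common).join(",")));
  const a = image(L1);
  const b = image(L2);
  return a.size === b.size && [...a].every((t) => b.has(t));
}

// Hypothetical event sets: the PN side has two helper events.
const E_BP = new Set(["Approaching", "Lower", "Entering", "Leaving", "Raise"]);
const E_PN = new Set([...E_BP, "ClosingRequest", "OpeningRequest"]);

// Hypothetical sample traces, one per model.
const L_BP = [["Approaching", "Lower", "Entering", "Leaving", "Raise"]];
const L_PN = [["Approaching", "ClosingRequest", "Lower", "Entering",
               "Leaving", "OpeningRequest", "Raise"]];

console.log(equivalent(L_BP, E_BP, L_PN, E_PN)); // true: helpers are projected away
```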
Using this definition, we denote $M_{BP}$ and $M_{PN}$ as the BP model and the PN model (respectively). Since the trains may infinitely approach, enter, and leave the crossing zone, we use Büchi automata to represent the languages $L(M_{BP})$ and $L(M_{PN})$. A Büchi automaton consists of a set of states and a transition function, where some states are defined as accepting and some as initial (starting). The automaton accepts input if and only if there is a run over this input that begins at an initial state, and at least one of the infinitely often occurring states is an accepting state.
We generate the automata using a depth-first search that traverses the state space of each model, where transitions correspond to events and all states are accepting (depicted in Figure 5). Thus, the accepting words of these automata represent the set of possible traces that each model may generate.
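The state-space traversal can be sketched for a small Petri net as follows. The net below is a hypothetical three-place cycle with place names `far`, `close`, and `inside`, loosely inspired by the railway-traffic subsystem; it is not taken from the paper's figures, and all arc weights are assumed to be 1. The search terminates only for bounded nets.

```javascript
// Transitions with their input (pre) and output (post) places.
const net = {
  approaching: { pre: ["far"],    post: ["close"]  },
  entering:    { pre: ["close"],  post: ["inside"] },
  leaving:     { pre: ["inside"], post: ["far"]    },
};
const initial = { far: 1, close: 0, inside: 0 };

const key = (m) => JSON.stringify(m); // marking -> state identifier
const isEnabled = (m, t) => net[t].pre.every((p) => m[p] > 0);

// Firing rule: consume one token per input place, produce one per output place.
function fire(m, t) {
  const next = { ...m };
  for (const p of net[t].pre) next[p] -= 1;
  for (const p of net[t].post) next[p] += 1;
  return next;
}

// Depth-first search over reachable markings. Every marking becomes an
// automaton state and every firing a transition labeled with its event.
function reachabilityGraph(m0) {
  const states = new Map([[key(m0), m0]]);
  const edges = [];
  const stack = [m0];
  while (stack.length > 0) {
    const m = stack.pop();
    for (const t of Object.keys(net)) {
      if (!isEnabled(m, t)) continue;
      const n = fire(m, t);
      edges.push([key(m), t, key(n)]);
      if (!states.has(key(n))) {
        states.set(key(n), n);
        stack.push(n);
      }
    }
  }
  return { states, edges };
}

const graph = reachabilityGraph(initial);
console.log(graph.states.size, graph.edges.length); // 3 3
```

Treating every state as accepting, as the paper does, means that each infinite path through this graph is an accepted word.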
To understand the significance of the difference between the two models, we analyze them using GOAL [29], a graphical tool for manipulating Büchi automata and temporal formulas. Our findings show that the language of the BP model is contained in the language of the PN model, $L(M_{BP}) \subset L(M_{PN})$, meaning that some runs are only possible in the PN model. One example of a word (or a trace) that is accepted only by the PN model involves two trains on the same railway, where the second train approaches the intersection zone after the first train leaves while the barriers are down. According to this trace, even though a train is already approaching the barriers, the latter may be raised only to be lowered again immediately after. As we describe in Section 6.3, Ghazel and Liu [27] added the keep down event to avoid this behavior, though it did not completely prevent it and caused other problems in the model. Thus, these redundant barrier actions are not aligned with the system requirements and do not stand to reason. We believe that such behavior is derived from their specific mechanism implementation. While the BP paradigm allows for a direct specification of the requirements as separate modules and their automatic composition, the PN modeling approach obliges the modeler to explicitly specify how the different modules interact.

Adjusting our model
To allow a simple comparison of the models, we now adjust our model to meet this behavior. We start with a redefinition of Requirement 2: "The barriers should be lowered after a train approaches. If the barriers were already lowered, then they should be raised, and immediately lowered again before the train enters the intersection zone".
Granted, this is a strange and tangled requirement, though it describes the observed behavior.
This adjustment requires the modification of the second b-thread. The original b-thread and its modified version are presented in Listing 4. Once a train leaves, the controller requests to raise the barriers while waiting for another approaching event, whichever comes first. If the event is approaching, it requests to raise the barriers. Otherwise, it waits for a train to pass again. In addition to this change, the fourth b-thread should be removed, to allow the raise of the barriers after the train approaches.
Given these modifications, the BP and the PN models are now equivalent.

Expanding to Multi-Track
Ghazel and Liu [27] observed this redundant behavior and tried to address it. In addition, they extended the model to support multiple railways. Many have used this extension as a benchmark for this domain [30], [31], [32], and we use it to continue our comparison. Figure 6 presents the extended PN model of [27] and our extended BP model. Extending the BP model required only multiplying the b-threads by the number of tracks. Railway-specific events (i.e., Approaching, Entering, and Leaving) were assigned an index while the barrier events remained the same. Since the behavior of each b-thread is valid for both the single-track version of the system and the multi-track version, we needed no further adaptations to the extended model. The extended PN model of [27] significantly changed the model and its semantics. To support multi-track, they multiplied the railway traffic subsystem and changed the other two subsystems as follows: two additional arcs were added ($p_6$ → closing request and raise → $p_6$), tokens were added, and some arc weights were changed.
To address the redundant raise-lower behavior, the model adds a helper event, called keep down, and its related arcs. These additions allow keeping the barriers down if a train approaches right after another one leaves. Alas, not only did their solution fail to solve the problem in all cases, it also created other problems, as we present in Section 8.1.
Applying the comparison algorithm of Section 6.1 reveals that the extended PN model is, again, not aligned with the requirements and is no longer equivalent to our model. In Section 8, we further analyze the differences between these models.

Adding Faults
In real life, discrete-event systems may have faults. For example, the entering sensor on a railway may be faulty and miss a train entering. Detecting and diagnosing such faults in real time is of paramount importance in DES modeling and has become an active research area in recent years. The research activity in this area is driven by the needs of many different error-prone domains. When modeling, faults are often added to the basic model that describes the standard system behavior. This may lead to inconsistent system behavior that is misaligned with prior requirements. In Section 8.1, we give multiple examples of such inconsistencies. Here, we demonstrate how our suggested method can assist modelers in verifying and analyzing the impact of faults on the initial requirements.
In the PN model detailed in [27], two classes of faults were added for diagnosis purposes (denoted with red transitions in Figure 7). The first one simulates a train-sensing defect and indicates that the train enters the level-crossing zone without triggering the entering sensor. Thus, the train may enter before the barriers are lowered. The second failure indicates a defect of the barriers that results in a premature raising.
A detailed look at their model reveals that the arcs to and from $p_9$ are not required for modeling the fault transitions. In fact, they were added as part of a mechanism for satisfying the original requirements given the new fault transitions. As we show in Section 8.1, these arcs fix one behavior and break others, blocking many legitimate traces that can no longer happen. This example, together with the multi-track extension, demonstrates the drawback of PN for modeling behaviors: adding new behaviors after the model is ready often requires modifying the mechanism and performing non-trivial adjustments to the model. It obliges modelers to think of all the side effects of these adjustments, something that may not be practical for complex behavior of large-scale systems.
The BP version of the fault transitions is presented in Listing 5. The first class of faults, simulating an unobservable train entering, is modeled using n b-threads, one for each railway. Each b-thread waits for a train to approach and then requests a signed entering event, representing the fault. The second b-thread models the second fault class of a premature barriers' raise. The b-thread waits for the barriers to be lowered. It then requests to raise the barriers using a signed event that represents this fault. We note that the fault events in our BP model are similar to the original non-
Listing 5: Faults b-threads for the level-crossing benchmark.

A Conclusion for the Level-Crossing Domain
The addition of multiple tracks and fault transitions demonstrates the dynamic and incremental development style of BP. The described b-program is modular because new requirements can be flexibly added as new b-threads. Since modeling often begins without faults, we believe that the ability to implement them without modifying (or even accessing) the existing model is a significant advantage. Furthermore, it maintains the alignment of the model with requirements. System requirements, both old and new, can be represented directly using a BP model.
At this point, a critical reader may ask whether the BP model is indeed error-free or has some unknown problems. We used the verification tool of BPjs to assert some properties of the model, such as that it has no deadlocks or livelocks. To validate that the model is indeed aligned with the requirements, we sampled the generated traces of the model and checked that they are aligned with the way we perceive the requirements.
To conclude the different versions of the LC benchmark, we present in Section 8.1 a quantitative comparison between the PN and the BP models for the LC benchmark. This comparison emphasizes the effect of the problems in the PN model on the resulting behavior. To the best of our knowledge, there are no other references in the literature to these problems in the LC benchmark.

TRANSLATIONAL SEMANTICS FROM PN TO BP
BP has tools for executing and analyzing the models (e.g., using formal methods). In addition, there are tools for converting BP models into other formats, such as Z3, GOAL, SPIN, Graphviz, and more. Nevertheless, PN has longstanding successful tools that may be more suitable for different use cases. Thus, we propose an approach for PN modelers to use BP to bridge the gap between system specification and PN implementation. System requirements can be specified directly using BP and implemented using PN. Hence an equivalence between the two models, in such a case, can indicate an alignment between requirements and implementation. This approach "eases" the transition for PN modelers and maintains the current advantages of PN. Although the automaton required for the comparison can be computed directly from PN, we now show a viable alternative that is more useful in practice -a direct translation from PN to BP. Tools based on these semantics can provide a uniform modeling environment where all modeling artifacts are translated to a common language.
Taking advantage of BP's modularity and flexibility, we can implement the dynamics of each place of the PN model as a separate b-thread. Each b-thread maintains the number of tokens in the place using a variable. Based on the number of tokens, it waits for or blocks a set of events that represents the transitions to and from the place. A translation example for $p_2$ is presented in Listing 6. If it has no tokens, it forbids the event "Closing Request" from taking place while waiting for an "Opening Request" event, which increases its tokens. If it has some tokens, it waits for both events and increases or decreases its tokens accordingly. The suggested translation is general and can be applied to all places of a PN model. The complete b-program combines all place b-threads and an auxiliary b-thread that requests all possible events at each round (as depicted in Listing 7). This program yields a behavior consistent with the entire PN dynamics.
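A minimal sketch of this place-to-b-thread translation, in plain JavaScript rather than BPjs, is given below. The generator `place`, the auxiliary `requestAll`, the bounded runner `run`, the complementary place $p_3$, and the initial markings (one token in $p_2$, none in $p_3$) are assumptions introduced for illustration, with all arc weights assumed to be 1.

```javascript
// B-thread for one place. `producers` are transitions that add a token,
// `consumers` are transitions that remove one (arc weight 1 assumed).
function* place(tokens, producers, consumers) {
  for (;;) {
    const stmt = tokens > 0
      ? { waitFor: [...producers, ...consumers] }    // either may fire
      : { waitFor: producers, block: consumers };    // empty: forbid consumers
    const e = yield stmt;                            // the selected event
    if (producers.includes(e)) tokens++;
    if (consumers.includes(e)) tokens--;
  }
}

// Auxiliary b-thread: requests every transition at every round.
function* requestAll(events) { for (;;) yield { request: events }; }

// Bounded runner; all b-threads here are infinite, so none terminates.
function run(bthreads, steps, choose = (enabled) => enabled[0]) {
  const live = bthreads.map((bt) => ({ bt, s: bt.next().value }));
  const trace = [];
  for (let i = 0; i < steps; i++) {
    const blocked = new Set(live.flatMap((x) => x.s.block || []));
    const requested = [...new Set(live.flatMap((x) => x.s.request || []))];
    const enabled = requested.filter((e) => !blocked.has(e));
    if (enabled.length === 0) break;                 // dead marking
    const e = choose(enabled);
    trace.push(e);
    for (const x of live) {
      if ((x.s.request || []).includes(e) || (x.s.waitFor || []).includes(e)) {
        x.s = x.bt.next(e).value;                    // pass the event in
      }
    }
  }
  return trace;
}

const CLOSE = "ClosingRequest", OPEN = "OpeningRequest";
const trace = run([
  place(1, [OPEN], [CLOSE]),   // p2: assumed to hold one token initially
  place(0, [CLOSE], [OPEN]),   // p3: assumed complementary interlock
  requestAll([CLOSE, OPEN]),
], 4);
console.log(trace.join(" ")); // ClosingRequest OpeningRequest ClosingRequest OpeningRequest
```

Because enabling is decided purely by blocking, a transition is selectable exactly when none of its input places is empty, which mirrors the PN firing rule for unit arc weights.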
Based on these semantics, we translated to BP all the PN models in this paper. To verify the correctness of each translation, we generated the automaton of the PN model using SNAKES [33], a Python library for Petri nets. Next, we generated the automaton for the translated (to BP) model. Finally, we verified that the models are equivalent, using the method described in Section 6.1. The paper's repository (github.com/bThink-BGU/Papers-2022-BP-PN) contains the source code for all models (PN, translated-PN-to-BP, and BP) and their automata. In Section 8.1, we empirically evaluate the differences between the BP and the PN/translated models. Although a translation from BP to PN is not necessary in the context of this paper, it can be easily done. An automaton representing the behavior of the model, such as the one depicted in Figure 5, can be automatically generated (elaborated in Section 6.1). This automaton can be viewed as a special case of a simple PN with a single token passing between states (i.e., places).
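The view of an automaton as a PN with a single token can be sketched as follows. The function names and the three-state example automaton are hypothetical and only illustrate the idea: every state becomes a place, every edge becomes a transition with one input and one output place, and the initial state holds the only token.

```python
def automaton_to_pn(states, initial, edges):
    """Build a single-token PN from an automaton.

    `edges` is an iterable of (source, event, target) triples.
    Returns the initial marking and the list of transitions.
    """
    marking = {s: 0 for s in states}
    marking[initial] = 1  # the single token sits in the initial state
    transitions = [{"label": e, "pre": s, "post": t} for (s, e, t) in edges]
    return marking, transitions

def enabled(marking, transitions):
    """A transition is enabled when its input place holds the token."""
    return [t for t in transitions if marking[t["pre"]] >= 1]

def fire(marking, transition):
    """Move the token from the input place to the output place."""
    new_marking = dict(marking)
    new_marking[transition["pre"]] -= 1
    new_marking[transition["post"]] += 1
    return new_marking

# Hypothetical three-state automaton, not one of the paper's models.
marking, transitions = automaton_to_pn(
    ["s0", "s1", "s2"],
    "s0",
    [("s0", "Approaching", "s1"), ("s1", "Lower", "s2"), ("s2", "Leaving", "s0")],
)
first = enabled(marking, transitions)
after = fire(marking, first[0])
```

Because each transition consumes and produces exactly one token, the total number of tokens stays one, so the marking always identifies the current automaton state.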

RESULTS
The example of the level-crossing benchmark allowed us to demonstrate our claims on PN. The purpose of this section is to further establish our claims in two ways: 1) quantifying the differences between PN and BP, and 2) demonstrating our claims on other PN models to support our hypothesis that the problem is rooted in the language constructs.
All the code examples in this paper and the data we used for comparing the models can be viewed at github.com/bThink-BGU/Papers-2022-BP-PN.

Empirical Results for the Level-Crossing Benchmark
To quantify the differences between the two approaches, we computed the state space (i.e., reachability graph) of the PN and the BP programs, with and without failures. We ran each program with a varying number of railways. To evaluate the effect of the helper events on the state space, we removed all transitions $s \xrightarrow{e} t$, where $e$ is a helper event, and rewired all incoming transitions of $s$ into $t$. We denote the resulting state space as PN*.
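One possible reading of this rewiring step is sketched below. The function name `remove_helper_events`, the edge representation, and the final pruning of unreachable states are assumptions made for illustration, and the four-state example is hypothetical rather than taken from the benchmark.

```python
def remove_helper_events(initial, edges, helpers):
    """Remove helper-event transitions and rewire incoming edges.

    `edges` is an iterable of (source, event, target) triples.
    For each helper edge s -e-> t, the edge is dropped and every
    edge that entered s is redirected to t.
    """
    edges = set(edges)
    while True:
        helper_edge = next((e for e in sorted(edges) if e[1] in helpers), None)
        if helper_edge is None:
            break
        s, _, t = helper_edge
        edges.discard(helper_edge)
        if s != t:
            edges = {(src, ev, t if dst == s else dst) for (src, ev, dst) in edges}
            if initial == s:
                initial = t
    # Assumption: states that became unreachable are dropped from the result.
    reachable, frontier = {initial}, [initial]
    while frontier:
        current = frontier.pop()
        for (src, _, dst) in edges:
            if src == current and dst not in reachable:
                reachable.add(dst)
                frontier.append(dst)
    edges = {e for e in edges if e[0] in reachable}
    return initial, reachable, edges

# Hypothetical state space with one helper event "h".
example_edges = [
    ("a", "Approaching", "b"),
    ("b", "h", "c"),
    ("c", "Lower", "d"),
    ("d", "Leaving", "a"),
]
new_initial, new_states, new_edges = remove_helper_events("a", example_edges, {"h"})
```

In this small example, the state `b` disappears and the `Approaching` edge leads directly to `c`, so both the number of states and the number of transitions drop from four to three.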
The results are summarized in Table 2. For comparison, we use the BP model without Requirement 2, since it better aligns with the requirements. We implemented the PN model of [27] using the translational semantics presented in Section 7. Notably, for any n with faults, the number of states and transitions of the PN model matches the reported numbers of [27], thus validating our implementation (they did not provide statistics for other implementations). To make the differences between the models accessible to the reader, in Table 3, we present the state and transition reduction between the different models. For all models, the average reduction in states is similar to the average reduction in transitions. From PN to PN*, the model is reduced by almost 50%; from PN* to BP, the model is reduced by almost 70%; and finally, from PN to BP, the model is reduced by approximately 80%. This dramatic reduction allows for applying reasoning techniques and formal methods on larger models compared to PN. Furthermore, the composable structure of BP programs allows for applying compositional verification and compositional formal method techniques [19].
In Table 4, we evaluate the equivalence of the models by comparing the possible traces of each model (as described in Section 6.1). Unsurprisingly, the number of paths grows exponentially with the size of the state space. Therefore, we generated only traces of a maximal length of sixteen and only for a limited number of railways due to memory limitations. This comparison sheds much light on the differences between the models, as we now elaborate.
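A bounded trace comparison of this kind can be sketched as follows. The function `traces_up_to` and the two toy automata are hypothetical and much smaller than the paper's models. The sketch only shows how enumerating traces up to a maximal length exposes traces that are unique to one model.

```python
def traces_up_to(initial, edges, max_len):
    """Return all non-empty event sequences of length <= max_len."""
    outgoing = {}
    for (s, e, t) in edges:
        outgoing.setdefault(s, []).append((e, t))
    traces = set()
    frontier = {((), initial)}  # pairs of (trace so far, current state)
    for _ in range(max_len):
        next_frontier = set()
        for (prefix, state) in frontier:
            for (event, target) in outgoing.get(state, []):
                next_frontier.add((prefix + (event,), target))
        traces |= {prefix for (prefix, _) in next_frontier}
        frontier = next_frontier
    return traces

# Toy "BP-like" automaton (hypothetical).
bp_edges = [
    (0, "Approaching", 1), (1, "Lower", 2), (2, "Entering", 3),
    (3, "Leaving", 4), (4, "Raise", 0),
]
# Toy "PN-like" automaton: same edges plus a redundant raise/lower loop.
pn_edges = bp_edges + [(2, "Raise", 5), (5, "Lower", 2)]

bp_traces = traces_up_to(0, bp_edges, max_len=4)
pn_traces = traces_up_to(0, pn_edges, max_len=4)
only_pn = pn_traces - bp_traces  # traces possible only in the PN-like model
only_bp = bp_traces - pn_traces  # traces possible only in the BP-like model
```

Since the set of traces can grow exponentially with `max_len`, a bound such as the length of sixteen used above keeps the enumeration feasible.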
In Section 6, we showed that for a single track without fault transitions, the resulting language of the BP model is contained in the resulting language of the PN model (i.e., $L_{M_{BP}} \subset L_{M_{PN}}$), meaning that some traces are only possible in the PN model. As Table 4 shows, this phenomenon is relevant for multiple tracks as well. To recall, the PN model allows for redundant raise and lower actions of the barriers after the train approaches.
The inclusion of fault transitions in the models increased the differences between them. In addition to traces unique to the PN model, the introduction of fault transitions resulted in runs unique to the BP model. One example of a trace that is accepted by the BP model only is: $(Approaching \cdot FaultEntering \cdot Leaving)^{\omega}$. In this case, a train enters the crossing zone before the barriers are lowered and then leaves. A train should be able to leave the crossing zone regardless of the barriers' state. To understand why this trace is not possible in the PN model with faults, we recall the mechanism (described in Section 6.4) for complying with the original requirements, which adds arcs to and from $p_9$. This mechanism caused an unexpected side effect. Another trace that is accepted only by the BP model is: $(Approaching \cdot Lower \cdot Entering \cdot Leaving \cdot Approaching \cdot FaultRaise \cdot Entering)^{\omega}$. Here, the barriers were raised, although the system was unaware of this event. Then a train entered the crossing zone. While such behavior is reasonable and may happen, this trace cannot happen in the PN model, revealing another side effect of the mechanism.

Table 2: Number of states and transitions in the state spaces of the PN, PN*, and BP models for n railways, without and with faults.

Without faults
n | PN states | PN transitions | PN* states | PN* transitions | BP states | BP transitions
1 | 10 | 13 | 6 | 8 | 5 | 6
2 | 83 | 185 | 37 | 87 | 13 | 26
3 | 483 | 1,532 | 168 | 537 | 35 | 101

With faults
n | PN states | PN transitions | PN* states | PN* transitions | BP states | BP transitions
1 | 20 | 43 | 13 | 26 | 13 | 26
2 | 142 | 500 | 85 | 284 | 39 | 126
3 | 832 | 4,085

Additional Petri-Net Models
We now turn to demonstrate our claims on two other domains to support our hypothesis that the problem is rooted in the language constructs.

The Dining Philosophers
The famous dining philosophers problem has been modeled using PN in many papers and was also modeled in BP [18].
In Listing 8, we present the BP implementation for this problem and the PN model in Figure 8. We took the PN model from a tutorial for Workcraft, a framework for interpreted graph models that supports modeling, verification, and synthesis of such models [34], [35], [36]. Both implementations define the two basic behaviors of the system, one for the philosophers and one for the forks. Both implementations may cause the same two problems: a deadlock and starvation. These problems can be defined as two additional liveness requirements: 1) a picked-up fork will eventually be put down, and 2) a hungry philosopher will eventually eat. While both PN and BP have tools for detecting liveness problems, BP allows developers to directly specify the liveness requirements, as presented in Listing 9. A hot synchronization point specifies that whenever the b-thread arrives at this point, it will eventually proceed (i.e., one of the requested or waited-for events will be selected).
BP offers several approaches for automatically enforcing a correct execution in terms of liveness, including synthesis, runtime look-ahead, and reinforcement learning [37]. Another option to enforce the execution correctness is explicitly implementing known solutions, like resource ordering and a central arbitrator. Resource ordering in BP can be achieved using priorities. Here, the priority of the philosophers' requests is inversely proportional to their index. We demonstrate the use of priorities in the following example (see Listing 11). The central arbitrator solution is presented in Listing 10, where philosophers must get a hold on a central semaphore to eat (i.e., take and put the forks). The literature also offers PN solutions [38], which we do not present for brevity. Nevertheless, like our previous examples, they require the mechanism specification for interweaving all of the requirements together.

Listing 9: Liveness requirements for the dining philosophers. A hot synchronization point specifies that whenever the b-thread arrives at this point, it will eventually proceed.

Tic-Tac-Toe
The last domain we present is the game of Tic-Tac-Toe. The game is of particular interest to us as it was developed using BP in one of the earliest papers of the paradigm [23]. The requirements of the game are well known, and the BP implementation of the game was published long before this paper. Thus, this domain stands as a touchstone for our hypothesis.
We take the PN model from [39], which was used to create a domain-specific language (DSL) for the game. As depicted in Figure 9, this PN model specifies only two requirements of the game: 1) a cell can be marked only once, and 2) X and O play in turns, where X starts. The mechanism for integrating these requirements is specified without the use of helper events and is fairly understandable. Nevertheless, there are two additional requirements: 3) the first player to get three marks in a line is the winner, ending the game, and 4) if no player has won and all nine squares are marked, then the game is over with a tie. Although, according to this model, the game ends upon the marking of the last cell, there is no tie declaration (i.e., transition). This is where the model gets complicated. Since [39] did not create a complete model of the game, we took this mission upon ourselves. In Figure 10, we added an interlock mechanism for handling the case that X wins by placing Xs in the first row. The "game token" is added to ensure that the game will stop once the token is gone (i.e., the game ends). The place "row 0 X counter" waits for three "X" tokens to arrive and then fires them to the "row 0 X win" transition, together with the "game token". Since the transition has no outgoing edges, it acts as a sink that terminates the game. In total, we added 23 edges, two places, one transition, and one token. Since the new model was extremely noisy and hard to understand, we decreased the opacity of the original specification. Handling the other seven lines will require an additional 35 edges, seven transitions, and seven places. Adding a tie event with a lower priority than a winning event requires an additional interlocking mechanism.

Listing 11: A behavioral program for the game of Tic-Tac-Toe, taken from [23]. Lines 1-20 specify the same behavior as the PN model of Figure 9, and the b-thread in line 29, with l="first row", specifies the same behavior as the PN model of Figure 10.

Figure 10: An extension of the PN model of [39]. The model includes a mechanism for terminating the game when X wins by taking the first row.
Listing 11 presents a BP implementation of the game, taken from [23]. Each b-thread represents a single aspect of the game and is unaware of other aspects. For example, the last b-thread repeatedly asks for placing X and O at any of the nine cells. It is unaware of other rules, like turns, which are enforced by the second b-thread. Cell and line b-threads are duplicated for each cell/line. There is one exception to the separation of concerns between the b-threads: the numbers 90 and 100 in the winning b-threads. These numbers represent the priority of the event. If a game has nine moves and the last mark wins the game for the X player, then the 'XWin' event overcomes the 'Tie' event. Notably, the Tie requirement refers to the winning requirement; thus, technically, the separation of concerns is violated in the requirements, and the alignment is kept. Nevertheless, there are several solutions to this issue (e.g., using context [9]), though they are out of the scope of this paper.
Notably, the b-threads in lines 1-20 specify the same behavior as the PN model in Figure 9. Similarly, the b-thread in line 29, with l="first row", specifies the same behavior as the PN model in Figure 10. This comparison emphasizes the conciseness of the BP model compared to the PN model.

RELATED WORK
Giua and Silva [40] pointed out that while the use of PNs with state specifications is a very mature area, their use in the design of systems from general behavioral specifications has not been equally successful. The latter issue can in some cases lead to incorrect specifications, faulty implementations, and inconsistent system behavior. Finding a more general approach to system modeling is still an open problem driving several developments in the PN field.
The ability to structurally define the entire system behavior as a function of the behavior of its subsystems is a key factor in designing systems from a general behavior description. Hence there has been much research concentrating on PNs compositionality and sub-PNs interactions representation. Several works introduced a compositional extension of PNs using process algebras [41], [42], [43]. They provide an approach for a high-level description of interactions, communications, and synchronizations between PNs. In another work, Baldan et al. [44] represented PNs interactions by introducing open PN, a generalization of the ordinary model. In open PN, some places, designated as open, reflect interaction with other nets. Concretely, an open place can function as an input or an output (or both), meaning that external PNs can put or remove tokens from it. Kindler and Petrucci [45] proposed a similar approach that adds an interface for each module, called channel, that specifies the input and the output of the module. All approaches require the definition of an interface between the different components, thus improving their abstraction (as with interfaces of object-oriented programming). Nevertheless, the modelers are still required to consider the mutual dependencies for specifying these interfaces. We argue that the BP approach addresses compositionality more naturally, with the ability to compose behaviors without direct consideration of mutual dependencies.
An important part of generalizing PN modeling is the ability to represent an interface of the system with the environment. Plain PNs are not adequate to model systems that can interact with their environment or, in another view, are only partially specified. The above-mentioned open PN [44] can also model external interaction, where some nets in the whole model represent the system's environment. In another related work, reactive PN [46] addresses this issue by defining reactive semantics for PN, specifically, splitting the set of transitions into internal and external and modifying its firing rules. These semantics state that if an external transition is enabled, it may fire, while, in contrast, an internal transition, when enabled, must fire. Such behavior is desired in systems that are specified to react as a consequence of external events. We view the ability to accurately model real-time scenarios in reactive systems to be of great importance. BP addresses interaction with an external environment in its semantics [47] and implementations [17] using the mechanism of super-steps, which capture the priority of internal events over external ones and reflect the notion of logical execution time [48].

DISCUSSION
There is a qualitative difference between BP models and PN models: BP focuses on breaking systems into requirements, and PN focuses on specifying the components of a system and how they interact. While both approaches have many merits, we argued in this paper that BP is better for specifying system behavior when it is a composition of requirements. To support our claim, we methodologically demonstrated over three problems how PN obliges modelers to specify a mechanism for combining the different behaviors. This results in over-specification, incorrect models, or complicated models that are not directly aligned with the requirements. In view of the long-standing success of Petri nets, we propose, as future work, to get the best of the two by integrating BP semantics (i.e., wait, request, and block) with Petri nets, as previously done with statecharts [49].
The problem of a language that obliges users to say things they do not wish to say does not solely belong to Petri nets. Other programming and modeling languages, compilers, and sometimes even IDEs share it. We believe that this problem may be rooted in the relationship between programming languages and the early, "hard" version of the linguistic relativity hypothesis. For years, programming languages have directed users to adapt their thinking. Behavioral programming is different in that its primary design goal is to allow its users to specify the system behavior in a natural and intuitive manner that is aligned with how they perceive the system requirements. We are not saying that BP does not share this problem; however, the BP community is devoted to refining and extending the paradigm to eliminate these problems. For example, a recent extension to the paradigm [9] has pointed out that the absence of context idioms in BP obliges users to define mechanisms for specifying context-dependent requirements. These mechanisms either break the alignment to the requirements or break the correctness of the model. To allow users a more natural specification of their context-dependent requirements, the extension of [9] adds context idioms as first-class citizens of the language.
In practice, software projects rarely start with well-defined requirements. Reasonably, this may be related to the challenge of maintaining the requirement documents, the design documents, and the traceability between the requirements and the code. Generally speaking, requirements are how people describe their system. Thus, a shift left of modeling/programming languages towards a more natural specification/programming of the requirements may lead to an evolutionary step in the field. There are other possible solutions and approaches to this problem. We believe that the software engineering community will benefit from adopting modern linguistics approaches and searching for these solutions.