Smart objects must possess means to sense physical phenomena, such as temperature, light or motion, or trigger actions having an effect on the physical reality. Smart objects must be able to:
- identify
- communicate
- compute
- have direct interaction with the environment
Backscattering exploits RF signals already present in the environment: devices harvest power from the available RF signals (such as TV, smartphone and Wi-Fi transmissions) and use it to reflect and modulate the signal they receive. In ambient backscattering we use existing RF signals without requiring any additional emitting device. However, this approach has some drawbacks:
- we can only achieve low data rates, thus it can only be used for low-rate transmissions such as payments
- signal might be unavailable
- signal in indoor environments is very weak
A particular case of this technique is the RFID system, in which an RFID tag backscatters the signal harvested from an RFID reader. The main advantage is that in this setting the RFID power source is always available. We observe that when multiple devices operate simultaneously, the reaction time of each single device increases significantly.
Without a central infrastructure, communication becomes more difficult, in fact:
- the lack of a central entity makes the network more difficult to organize; to solve this, participants must self-organize, which in turn might be complex because of the need to discover the other devices
- there is often a limited range of wireless communication, but this can be solved through multi-hop wireless networks
- the possible mobility of the participants can hinder wireless communication ranges; indeed, mobility changes neighborhood relationships (in the case of cellular networks we just move to another physically fixed tower), and to solve this problem we need adaptive protocols, since routes must be configured adaptively (this can become complicated if the network scales)
- the devices often have a limited battery, which implies that we must be energy efficient
The main components are the following:
- policy: a map from states to actions, which alone determines the complete behaviour of the agent; it can be deterministic or stochastic, and if an action selected by the policy is followed by low reward, the policy may be changed accordingly
- value function: a prediction of future reward; it estimates the expected cumulative future reward from the current state, discounted by a factor $\gamma$, i.e. $v(s) = \mathbb{E}[R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots \mid S_t = s]$ (a numeric sketch follows the list)
- model: predicts what the environment will do next; however, most RL problems are model-free
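To make the discounting concrete, here is a minimal Python sketch that computes the cumulative discounted reward of a finite reward sequence (the reward values and $\gamma$ are made-up numbers for illustration):

```python
def discounted_return(rewards, gamma=0.9):
    """Compute G = R_1 + gamma*R_2 + gamma^2*R_3 + ... by folding backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# A reward of 10 arriving three steps in the future is worth 0.9^3 * 10 = 7.29 now
print(discounted_return([1, 0, 0, 10], gamma=0.9))  # 1 + 7.29 = 8.29
```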
What is the exploration/exploitation dilemma, and how does the action selection process work in RL systems?
One of the challenges that arise in RL is the trade-off between exploration and exploitation. At each timestep, an RL agent has to choose between two possible choices:
- exploit the current knowledge that it has gained through experience, i.e. choose actions that it has already tried and found to be effective in producing reward
- explore the other, still unknown actions to check the reward they yield, which may be greater than that of the currently known actions
Therefore, the agent has to exploit what it has already experienced in order to obtain reward, but it has to explore in order to make better action selections in the future. In particular, by only choosing actions that we already know yield high reward, we are guaranteed to maximize the reward given the current knowledge, but we might get stuck on a local maximum. This is the exploration/exploitation dilemma, and we can tackle it by carefully choosing the strategy with which the RL agent selects its next action.
We can select an action based on the following strategies:
- random: this is the most naive strategy, and it corresponds to always choosing exploration without ever exploiting past knowledge
- greedy: the opposite of the random strategy, since by choosing actions greedily the RL agent always exploits, i.e. it always chooses the action that maximizes $Q_t(a)$ (however, we could get stuck on a local maximum, as previously mentioned)
- $\varepsilon$-greedy: this is the best approach, which balances exploration and exploitation based on some probability $\varepsilon$; in particular, we behave greedily most of the time, but every once in a while we select a random action among all the possible ones, independently of the action-value estimates (a small sketch follows this list)
- optimistic initial values: we set the initial $Q$ values optimistically high, as if we "hope" that each action already yields a lot of reward, to then be "disappointed" upon learning the real (possibly low) rewards of the actions and adjusting the $Q$ values. The advantage is that this approach encourages the agent to explore more before the estimates converge
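A minimal Python sketch of $\varepsilon$-greedy action selection combined with optimistic initial values (the number of actions, the initial estimate and $\varepsilon$ are illustrative assumptions):

```python
import random

def epsilon_greedy(Q, epsilon=0.1):
    """Pick an action index from the value estimates Q with epsilon-greedy."""
    if random.random() < epsilon:
        return random.randrange(len(Q))            # explore: uniform random action
    return max(range(len(Q)), key=lambda a: Q[a])  # exploit: greedy action

# Optimistic initial values: start every estimate artificially high, so early
# "disappointments" push the agent to try every action at least once.
Q = [5.0] * 4                          # 4 actions, optimistically initialized
action = epsilon_greedy(Q, epsilon=0.1)
```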
Wireless sensor networks use battery-operated computing and sensing devices, i.e. the sensors are battery powered. Since batteries have finite capacity and battery replacement is a costly operation, we must be as energy efficient as possible, especially in large-scale deployments. This implies that low-power communication is required, and that we must reduce the energy wasted by sensors that are inactive for prolonged periods of time.
In particular, energy is wasted for a multitude of different reasons:
- collisions: each time a collision happens (i.e. more than one packet reaches the same node at the same time) the content has to be sent again, which inherently wastes energy
- overhearing: if a node receives a packet that was not meant for it, it wasted energy receiving it
- control-packet overhead: the number of control packets has to be kept as low as possible
- idle listening: this is the situation in which the sensor listens to an idle channel waiting for possible incoming packets; notably, idle listening consumes more than 50% of the energy required for receiving
- overemitting: we want to avoid sending a message when the destination node is not yet ready to receive it
In a wireless sensor network we want a routing protocol to be able to:
- establish a multi-hop communication
- minimize the energy consumption
- maximize the lifetime of the network
However, in WSNs routing poses several challenges due to the unique characteristics and constraints of such networks:
- energy efficiency: since sensor nodes are battery powered, and replacing batteries is a costly operation, the chosen routing strategy must be as energy efficient as possible in order to prolong the network's overall lifetime
- dynamic network topology: routing protocols have to be flexible in order to adapt to node failures (due to energy depletion or environmental conditions) or to the addition of new nodes; moreover, in some applications the nodes might be mobile, which implies that neighborhood relationships might change and routes may vary accordingly
- scalability: as the number of nodes increases, the routing protocol must scale efficiently; indeed, scalability is a crucial aspect of a desirable routing protocol, which must remain efficient in networks with a large number of nodes
In Aloha-based MAC protocols for RFID systems we have at least two possible protocols:
- Framed Slotted Aloha (FSA): this protocol divides time into frames, and each frame is subdivided into slots; each time the RFID reader issues a frame (announcing the number of slots it contains), each RFID tag randomly chooses a slot of the frame in which to reply. Then, if only one tag answered in a slot, that tag has been successfully identified; otherwise, if a collision occurred, the tags that collided repeat the process, trying to find a slot in which to be identified (a toy simulation follows this list).
- Tree Slotted Aloha (TSA): this is a variant of FSA in which slots are organized following a tree, and a new child frame is issued for each collision slot (as in FSA, only the tags that collided participate in the child frame).
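A toy Python simulation of a single FSA frame may help to visualize the slot-choice mechanism (the tag count and frame size are arbitrary example values):

```python
import random
from collections import Counter

def fsa_frame(n_tags, n_slots):
    """Simulate one FSA frame; return (identified tags, tags that must retry)."""
    choices = [random.randrange(n_slots) for _ in range(n_tags)]  # each tag picks a slot
    per_slot = Counter(choices)
    identified = sum(1 for c in per_slot.values() if c == 1)  # singleton slots
    collided = sum(c for c in per_slot.values() if c > 1)     # colliding tags
    return identified, collided

identified, collided = fsa_frame(n_tags=20, n_slots=16)
```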
In a slotted Aloha protocol, how is the tag population participating in intermediate frames estimated?
Define the following variables:
- $N$ = number of slots per frame (known)
- $n$ = number of tags participating in a frame (not known)
- $c_x$ = observed number of slots that had $x$ tags responding in them (known)
- $a_x$ = estimated number of slots that have $x$ tags responding in them (not known)
which directly implies that
- $c_0$ = observed number of empty slots (known)
- $c_1$ = observed number of identification slots (known)
- $c_k$ = observed number of collision slots (known)
- $a_0$ = estimated number of empty slots (not known)
- $a_1$ = estimated number of identification slots (not known)
- $a_k$ = estimated number of collision slots (not known)
In particular, assuming each tag chooses its slot uniformly at random, we have that $a_x = N \binom{n}{x} \left(\frac{1}{N}\right)^{x} \left(1 - \frac{1}{N}\right)^{n-x}$, the expected number of slots containing exactly $x$ of the $n$ tags.
Then, to estimate the total number of tags we can, for instance, use the simple lower bound $\hat{n} = c_1 + 2 c_k$ (each collision slot contains at least two tags), or choose the $n$ whose expected values $(a_0, a_1, a_k)$ best match the observed $(c_0, c_1, c_k)$.
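A short Python sketch of these estimators (the observed slot counts are example values, and `estimate_best_match` is an illustrative name for the best-match search, not an official one):

```python
from math import comb

def expected_slots(n, N, x):
    """a_x = N * C(n, x) * (1/N)^x * (1 - 1/N)^(n - x)."""
    return N * comb(n, x) * (1 / N) ** x * (1 - 1 / N) ** (n - x)

def estimate_lower_bound(c1, ck):
    """Each collision slot holds at least 2 tags."""
    return c1 + 2 * ck

def estimate_best_match(N, c0, c1, ck, n_max=200):
    """Pick the n whose expected (a_0, a_1, a_k) is closest to the observations."""
    def err(n):
        a0, a1 = expected_slots(n, N, 0), expected_slots(n, N, 1)
        ak = N - a0 - a1  # expected collision slots
        return (a0 - c0) ** 2 + (a1 - c1) ** 2 + (ak - ck) ** 2
    return min(range(1, n_max), key=err)

print(estimate_lower_bound(c1=3, ck=5))        # at least 13 tags
print(estimate_best_match(N=16, c0=3, c1=3, ck=10))
```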
The Binary Splitting (BS) protocol works as follows:
- at first a query is issued by the RFID reader to discover the possible tags
- then, assuming that each RFID tag contains an internal counter, if a collision happened, all the tags that collided will flip a coin (i.e. generate a random number between 0 and 1) and add the outcome to their counter; moreover, every other tag that did not participate in the collision increments its own counter by 1
- at this point we will have some tags that have a counter set to 0, and some other tags with the counter set to a number greater than 0; this implies that we virtually split the set of tags to identify into two groups (hence the name of the protocol)
- now, we repeat the process, but only the tags that have their counter set to 0 will reply and contend
- at the end, when only one tag replies, we have finally performed an identification; to identify all the other tags we use the following rule: if none or one tag replies, all tags decrement their counters by 1, which means that we go up one level in the binary tree that we constructed by repeatedly splitting the tags into two groups (a compact simulation follows)
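A compact Python sketch of the counter mechanism of Binary Splitting (the tag IDs are arbitrary placeholders):

```python
import random

def bs_round(counters):
    """One BS round over {tag_id: counter}; returns the identified tag or None."""
    replying = [t for t, c in counters.items() if c == 0]
    if len(replying) > 1:                 # collision: colliders flip a coin,
        for t in counters:                # bystanders increment by 1
            counters[t] += random.randint(0, 1) if counters[t] == 0 else 1
        return None
    identified = replying[0] if replying else None
    if identified is not None:
        del counters[identified]          # the identified tag leaves the contention
    for t in counters:                    # none or one replied: go up one level
        counters[t] -= 1
    return identified

counters = {tag: 0 for tag in ("A", "B", "C", "D")}  # all tags start at 0
while counters:
    bs_round(counters)
```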
Explain the differences between proactive and reactive routing in sensor networks. Discuss the advantages and disadvantages
Routing protocols are divided into flat and hierarchical protocols. In particular, flat protocols can be in turn subdivided into the following types:
- proactive protocols: protocols of this type always try to keep their routing data up-to-date; as the name suggests, at the establishment of the network there is a heavy exchange of packets in order to transfer the data necessary to determine all the possible routes between the available nodes. In fact, in proactive protocols each node maintains routes to all reachable destinations at all times, even if there is currently no need to deliver any packet to those destinations. This approach has the advantage that when a route is actually needed no delay is required to discover it, so latency is reduced, but at the cost of a lot of overhead at the beginning of the process. Moreover, this approach requires the information in the routing tables to always be up-to-date, which is not only very hard to achieve in practice if the network topology changes frequently, but can be outright impossible.
- reactive protocols: the other type of flat protocols is reactive, and as the name suggests routes are only determined when they are actually needed. Indeed, when a path between a source and a destination is required, a kind of global search procedure is initiated in order to find a route between the two nodes. Clearly, this procedure inherently adds a discovery latency each time a route has to be discovered; however, this problem can be partially mitigated by caching routes at the individual nodes.
In conclusion, proactive approaches are fast but involve much overhead, while reactive approaches generate much less overhead but are comparably slower.
Define the agent state and the environment state, and explain how these two states differ. Give a practical example
In Reinforcement Learning, the state is the information used to determine what happens next. Formally, we define the history as the sequence of all observations, rewards, and actions up to time $t$: $H_t = O_1, R_1, A_1, \dots, A_{t-1}, O_t, R_t$.
Then, we define the state of the agent as a function of the history: $S_t = f(H_t)$.
The environment state $S^e_t$ is the environment's private internal representation, while the agent state $S^a_t$ is whatever internal representation the agent builds from the history. When the environment state is fully known by the agent we have $S^a_t = S^e_t$ and the environment is fully observable; otherwise the environment is only partially observable, as in the following examples:
- a trading agent which only observes current prices and does not have full knowledge of the market
- a robot with some sensors, which is inherently limited by the sensors it possesses and cannot fully know its environment
Describe the operation of a geographic routing protocol in a sensor network, and discuss some of its challenges. Describe georouting in the context of UAVs
One possible strategy for a routing protocol in WSNs is a geographic protocol, in which we assume that each node keeps a routing table for the next hop, but the information is based on the GPS coordinates of the other sensors. There are several approaches for forwarding packets:
- nearest node: this is the most straightforward approach, and aims at minimizing the transmission power, and involves sending the packet to our nearest node
- most forward within range $r$: this strategy consists in sending our packet to the neighbor within range that is closest to the destination (which is different from the neighbor furthest away from the sender), as sketched below
- directional routing: since we know the precise coordinates of the sensors, we can also choose the node that is closest to the direction of the destination (i.e. to the line that connects the current hop to the destination node); we observe that this strategy might get stuck in loops or dead ends, depending on the topology of the network
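A minimal Python sketch of a position-based next-hop choice of this family (greedy progress toward the destination; all coordinates are invented for the example):

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_next_hop(current, neighbors, destination):
    """Pick the neighbor closest to the destination; None signals a dead end
    (no neighbor is closer to the destination than the current node)."""
    best = min(neighbors, key=lambda n: dist(n, destination), default=None)
    if best is None or dist(best, destination) >= dist(current, destination):
        return None
    return best

# Hypothetical positions: current node, two neighbors, destination
hop = greedy_next_hop((0, 0), [(1, 1), (2, 0)], (5, 0))  # -> (2, 0)
```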
Nevertheless, in general geographic routing approaches allow to save more energy than proactive and reactive protocols. In fact, this is the reason why protocols of this type are utilized, for instance, in DRONETs, since drones are usually equipped with GPS devices and we clearly want to minimize the energy consumption of our UAVs. In particular, geographic routing for UAVNETs is based on three main approaches:
- greedy forwarding: this is essentially the most forward within range $r$ technique described above, in which we select the node that is geographically closest to the target destination; however, in the context of drones we may get stuck on a local optimum, in which the process is blocked at a node that is locally closest to the destination and cannot find any closer relay node
- store-carry and forward: with this strategy we leverage the mobility of UAVs; in fact, if a node does not have any relay node to send its packet to (perhaps because no other drone is in its transmission range), the UAV can physically carry the packet until it meets another node (or possibly the target destination itself)
- prediction: since we know the GPS location of the drones, and assuming we also know the direction and speed of the other UAVs, we are able to predict their next location and leverage the future position of a relay node
To solve the offloading problem we have the following distributed solution called stop & route, which is applied by the Distributed Gathering Algorithm (DGA). During the building phase, each UAV behaves as follows (see the sketch after this list):
- if it is in communication range of the depot, or of any other UAV connected to the depot, it stops
- if it is in communication range of a UAV not connected to the depot, it follows it (the follow-leader mechanism)
- if it is isolated, it keeps moving until it finds some UAV to follow
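The per-UAV decision rule can be sketched in Python as follows (a sketch of the three rules only; all names and data structures are illustrative, not from the protocol specification):

```python
def dga_step(neighbors, depot_in_range, connected_to_depot):
    """One building-phase decision of a single UAV in the sketched DGA.
    neighbors: UAVs currently in communication range;
    depot_in_range: whether the depot itself is directly reachable;
    connected_to_depot(u): whether neighbor u is (transitively) connected."""
    if depot_in_range or any(connected_to_depot(u) for u in neighbors):
        return "stop"                    # rule 1: join the connected backbone
    if neighbors:
        return ("follow", neighbors[0])  # rule 2: follow-leader mechanism
    return "keep-moving"                 # rule 3: isolated, search for others

decision = dga_step(neighbors=["u2"], depot_in_range=False,
                    connected_to_depot=lambda u: u == "u2")
# -> "stop": u2 is already connected to the depot, so this UAV halts next to it
```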
CSMA/CA stands for Carrier Sense Multiple Access with Collision Avoidance, and it is a MAC (Medium Access Control) protocol. In particular, we employ this protocol in WSNs because Collision Detection is not possible to perform in a wireless scenario, thus we must directly avoid collisions at all costs. Let the Inter-Frame Spacing (IFS) be some amount of time, and let DIFS and SIFS be two amounts of time that are respectively longer and shorter than the IFS. Then, the CSMA/CA protocol works as follows:
- the source node waits a DIFS, and then sends an RTS to the destination node
- then, the destination waits a SIFS and sends a CTS back to the source node (we observe that RTS and CTS are actually optional in the CSMA/CA)
- now, the source node will wait again a SIFS and will send its packet containing the actual DATA
- finally, after waiting a SIFS the destination node will send back an ACK to the sender node
- after the transmission between these two nodes is completed, and having waited for a DIFS, every other node enters the Contention Window, the phase in which the other nodes have to contend for the medium (i.e. the channel, hence the definition contention-based protocol); in particular, each node will try to access the medium, and if it fails because some other node already won the contention, it enters the Binary Exponential Backoff phase. When a node is inside the Contention Window it generates a random number in $[0, \mathrm{CW} - 1]$; then, if the node loses the contention (i.e. there is a collision, which is inferred whenever no ACK is received back from the destination node), it doubles the initially fixed value of $\mathrm{CW}$ and regenerates another random number in the new range.
The first station whose contention clock expires will start transmitting, and the other nodes will then sense that the medium is busy, which means that they will freeze their clocks and restart them after the completion of the current transmission (i.e., in the next contention period).
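A small Python sketch of the Binary Exponential Backoff draw (the contention-window bounds are example values, not mandated by the text above):

```python
import random

CW_MIN, CW_MAX = 16, 1024  # example contention-window bounds

def backoff_slots(retries):
    """Draw a backoff counter in [0, CW - 1], doubling CW at each failed attempt."""
    cw = min(CW_MIN * (2 ** retries), CW_MAX)
    return random.randint(0, cw - 1)

# First attempt draws in [0, 15]; after two collisions, in [0, 63]
print(backoff_slots(0), backoff_slots(2))
```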
Describe the Hidden and Exposed terminal problems, and how CSMA/CA is able to solve both problems
The Hidden terminal problem is defined as follows: we have a setting in which a terminal B lies between two terminals A and C; B is in communication range of both A and C, while A is not in C's range and C is not in A's range. Now, suppose that both A and C want to communicate with B: they will both send their messages to B, but if they send them at the same time the messages will collide when they reach B, meaning that the communication failed, and neither A nor C knows about the presence and the attempt of the other terminal. The problem is called the hidden terminal problem since, in some sense, A is hidden from C, and vice versa.
Differently, the Exposed terminal problem is defined in the following setting: consider the same scenario described before, and add a new terminal D such that it is only in C's communication range, and, most importantly, D's and A's communication ranges do not intersect. Now, if B wants to communicate with A, it can send its message, but C will overhear this communication and could mistakenly assume that everything in its range is unreachable due to the possibility of collision with whatever B is sending to A. However, since A's and D's communication ranges have empty intersection, this is a wrong assumption and the C-to-D communication would not cause any collision.
CSMA/CA with RTS and CTS is able to solve both problems thanks to the Request To Send (RTS) and Clear To Send (CTS) messages.
- Consider the scenario of the HTP: when A wants to communicate with B, it sends an RTS packet to B; then, if B is ready to receive, it replies with a CTS message, which is overheard by C, meaning that C now knows that B is already communicating with another device. This means that C will not try to communicate with B for the duration of that exchange.
- Now, consider the scenario of the ETP: when B wants to communicate with A, it analogously sends an RTS packet, and if A can receive the actual data without any issue it replies with a CTS packet. However, since C is not in A's communication range, it will not overhear A's CTS answer; having heard an RTS but no corresponding CTS, C deduces that it is outside the receiver's range and that it can freely use the medium and communicate with D without any issue.
Explain how a MAC protocol for sensor networks can address/mitigate the problem of energy consumption
To reduce the problem of energy consumption through MAC protocols we can adopt the sleep strategy, i.e. use what is called S-MAC (where the S stands for Sleep). S-MAC is a contention-based protocol that extends CSMA/CA in order to reduce idle listening. In fact, idle listening consumes more than 50% of the energy required for receiving, therefore this protocol aims at shutting the nodes down as much as possible, i.e. whenever it is guaranteed that the medium is already in use. In particular, the S-MAC protocol aims at reducing the duty cycle of a node, which is the ratio between the listen time and the sum of sleep and listen time (the effective utilization of the sensor).
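For instance, the duty cycle (and hence the fraction of time the radio is on) can be computed as in this trivial sketch (the listen/sleep durations are made-up numbers):

```python
def duty_cycle(listen_ms, sleep_ms):
    """Duty cycle = listen time / (listen time + sleep time)."""
    return listen_ms / (listen_ms + sleep_ms)

# e.g. a node that listens 100 ms out of every 1000 ms period
print(duty_cycle(100, 900))  # 0.1 -> the radio is on only 10% of the time
```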
Clearly, this idea rests on a big assumption: sleep must be synchronized, because we want to avoid the situation in which a node wants to talk to another node that is sleeping. To avoid this scenario, S-MAC provides SYNC packets, which exchange sleep data in order to synchronize. Initially, each node decides its own sleep schedule, but after exchanging SYNC packets the objective is to form virtual clusters of neighboring nodes that share the same sleep schedule, in order to reduce overhead. Moreover, these packets are used to keep the sleep schedules up-to-date and to adjust the timer of each node thanks to the Next Sleep Time field inside the SYNC packet. We observe that nodes at the border of multiple clusters have to stay awake more than usual, because they have to follow the sleep schedules of all the virtual clusters they belong to simultaneously. The process works as follows:
- at first, a node listens to the medium for a period of time
- then, if no other node sends a sleep schedule, it chooses its own sleep schedule and immediately sends a SYNC packet to inform the other sensors (indeed, such a node is called a synchronizer)
- otherwise, if a node receives a schedule from a neighboring node before choosing its own, it follows that schedule (and such a node is called a follower)
Lastly, one further mechanism against idle listening is the Network Allocation Vector (NAV), which is basically a duration counter carried in the header of each packet, so that all immediate neighbors of both the sender and the receiver of the packet immediately know that they can sleep for that duration. In fact, the NAV is a way to make the medium virtually busy; therefore, the medium is defined to be free only if it is both physically and virtually free.
One of the possible approaches available for tag identification in RFID systems is the Query Tree MAC protocol. The idea is fairly simple:
- the RFID reader queries a binary string, starting from the empty string
- each tag that has that very same binary string as a prefix of its own ID will reply, while the others will not reply at all
- every time a collision occurs, the reader appends a bit to the previously sent binary string (first 0, then 1), progressively exploring the tree of possible prefixes
- if only one tag replies, i.e. there is no collision, the reader performed an identification.
Although fairly simple to implement, we see that in the case of a uniform ID distribution among the tag population, the tree induced by the Query Tree approach and by the Binary Splitting protocol is the same. This happens because a set of uniformly distributed tags splits approximately equally at each query, as in the BS protocol. This directly implies that the performance of QT is the same as that of the BS protocol, which means that we can achieve roughly 36% system efficiency (measured as the ratio between the number of identified tags and the total number of queries issued).
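A compact Python simulation of the Query Tree identification process (the tag IDs are arbitrary bit strings chosen for the example):

```python
from collections import deque

def query_tree(tags):
    """Identify a set of fixed-length binary tag IDs with the Query Tree protocol."""
    identified, queries = [], deque([""])     # start from the empty prefix
    while queries:
        q = queries.popleft()
        replying = [t for t in tags if t.startswith(q)]
        if len(replying) == 1:
            identified.append(replying[0])    # exactly one reply: identification
        elif len(replying) > 1:
            queries.append(q + "0")           # collision: extend the prefix
            queries.append(q + "1")           # with 0 and with 1
    return identified

print(query_tree(["0010", "0111", "1100", "1101"]))
```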
The TSA protocol has a problem: its performance directly depends on the size of the population that the RFID reader has to identify. In particular, if too many tags choose the same slot there are a lot of collisions, and if too few tags are active there are a lot of empty slots, and both scenarios lead to a waste in time and energy, which must be avoided at all costs.
Therefore, since the efficiency of a frame is maximized when the frame size matches the number of participating tags, the reader estimates the tag population and sizes each frame accordingly.
However, there is an issue with this approach: the estimator does not capture the possibly high variance of the number of tags, which can lead to a very inadequate estimate of the actual population, and thus to poorly sized frames.
The BSTSA protocol takes the best aspects of Binary Splitting and Tree Slotted Aloha. In particular, it works as follows:
- first, the Binary Splitting procedure is run, through which the tags are divided into smaller and smaller groups, by building a binary tree in which we repeatedly split only the left siblings in the tree
- next, after we reach a group with a single tag in the left sibling of our tree, we apply the TSA protocol on the right siblings of the tree, starting from the last level and going up to the first
This allows us to predict the performance and the tag population more easily, thanks to the splitting process done in advance, thus reducing variance and improving stability. Indeed, we experimentally see that the protocol can yield a time system efficiency of up to 80%.
BATMAN stands for Better Approach To Mobile Ad-hoc Networking, and it is a proactive, distance-vector based routing protocol employed (as the name suggests) in MANETs and Mesh Networks. It is designed for decentralized decision-making and self-organizing networks; however, the main focus of the protocol is that it prioritizes link quality and robustness when establishing connections between hops. Indeed, the routing tables kept by the nodes are based on the link quality of the neighbors, and the latter is periodically evaluated through the Originator Messages (OGM).
In particular, each OGM contains a sequence number, which ensures message freshness, and a Link Quality Index that describes the quality of the connection. However, this raises a problem: how is link quality estimated? The Receive Quality (RQ) is the quality based on the data received from a neighbor, but this is not enough to assume that the link quality when sending a packet to that very same neighbor is identical (in fact, in this protocol link quality is asymmetrical). To estimate link quality accurately, each node broadcasts a packet to its neighbors and checks whether that packet comes back; this is used to evaluate the so-called Echo Quality (EQ). However, since the EQ is directly proportional to both the Transmission Quality (TQ) and the RQ, we can set $EQ = TQ \times RQ$, so that each node derives its transmission quality as $TQ = EQ / RQ$.