@ChuanyuXue
Forked from jeez/ Scheduled Tx Tools
Created April 15, 2022 19:02

Revisions

  1. Jesus Sanchez-Palencia revised this gist Oct 31, 2018. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions check_clocks.c
    Original file line number Diff line number Diff line change
    @@ -113,8 +113,8 @@ int main(int argc, char** argv)
    printf("rt latency:\t%llu\n", lat_rt);
    printf("tai latency:\t%llu\n", lat_tai);
    printf("phc latency:\t%llu\n", lat_ptp);
    printf("phc-rt delta:\t%llu\n", ptp - rt);
    printf("phc-tai delta:\t%llu\n", ptp - tai);
    printf("phc-rt delta:\t%lld\n", ptp - rt);
    printf("phc-tai delta:\t%lld\n", ptp - tai);

    close(fd_ptp);
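
The switch from `%llu` to `%lld` above matters because the PHC may be behind the system clock, in which case the delta is negative and an unsigned conversion prints a huge bogus number. Below is a minimal sketch of the same idea, assuming the timestamps are kept as unsigned 64-bit nanosecond counts. check_clocks.c is not shown in full here, so `clock_delta_ns` and `format_delta` are hypothetical helper names, not code from the gist.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Signed difference a - b between two nanosecond timestamps.
 * The unsigned subtraction wraps modulo 2^64; converting the result to
 * int64_t recovers the negative value on two's-complement platforms
 * (an assumption that holds for the usual Linux targets).
 */
static int64_t clock_delta_ns(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b);
}

/* Write "<label>:\t<delta>" into buf using a signed conversion.
 * Returns the snprintf() result (number of characters needed).
 */
static int format_delta(char *buf, size_t len, const char *label,
			uint64_t a, uint64_t b)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s:\t%" PRId64, label, clock_delta_ns(a, b));
}
```

Using `PRId64` instead of a hard-coded `%lld` avoids a format mismatch when the value is an `int64_t` rather than a `long long`.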

  2. Jesus Sanchez-Palencia revised this gist Oct 8, 2018. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions README.etf
    Original file line number Diff line number Diff line change
    @@ -48,12 +48,12 @@ e.g.: $ sudo tcpdump -c 60000 -i enp3s0 -w tmp.pcap \
    Our DUT uses an Intel i210 NIC, and our setup here is as follows.

    1.a) First, we setup mqprio as the root qdisc:
    e.g.: $ sudo tc qdisc replace dev IFACE parent root mqprio \
    e.g.: $ sudo tc qdisc replace dev IFACE parent root handle 100 mqprio \
    num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 hw 0

    1.b) Then we setup etf with the desired config:
    e.g.: $ sudo tc qdisc add dev enp2s0 parent 8001:1 etf \
    e.g.: $ sudo tc qdisc add dev enp2s0 parent 100:1 etf \
    offload clockid CLOCK_TAI delta 150000


  3. Jesus Sanchez-Palencia revised this gist Oct 3, 2018. 2 changed files with 2 additions and 3 deletions.
    2 changes: 1 addition & 1 deletion README.classifier
    Original file line number Diff line number Diff line change
    @@ -25,7 +25,7 @@ $ make
    How to run
    ----------

    $ ./dump-classifier -s <BATCH FILE> -f <FILTER FILE> -b <BASE TIME> -d <DUMP FILE>
    $ ./dump-classifier -s <BATCH FILE> -f <FILTER FILE> -d <DUMP FILE>

    <BATCH FILE> is a text file containing a batch file intended for use
    with 'tc -batch'. This allows dump-classifier to use the same file
    3 changes: 1 addition & 2 deletions README.taprio
    Original file line number Diff line number Diff line change
    @@ -191,5 +191,4 @@ The same can be done for ingress.
    time-slices. The base-time comes from the udp_tai for TC 0 minus the
    250us txtime offset as used below. For example:

    ./dump-classifier -b 1528320726000000000 -d tmp.pcap -f filter \
    -s taprio.batch | grep -v ontime
    ./dump-classifier -d tmp.pcap -f filter -s taprio.batch | grep -v ontime
  4. Jesus Sanchez-Palencia revised this gist Sep 28, 2018. 3 changed files with 344 additions and 110 deletions.
    24 changes: 19 additions & 5 deletions README.classifier
    Original file line number Diff line number Diff line change
    @@ -25,15 +25,29 @@ $ make
    How to run
    ----------

    $ ./dump-classifier -s <SCHED FILE> -f <FILTER FILE> -b <BASE TIME> -d <DUMP FILE>
    $ ./dump-classifier -s <BATCH FILE> -f <FILTER FILE> -b <BASE TIME> -d <DUMP FILE>

    <SCHED FILE> is a text file containing the traffic schedule; the format
    is exactly what taprio (the qdisc) accepts.
    <BATCH FILE> is a text file containing a batch file intended for use
    with 'tc -batch'. This allows dump-classifier to use the same file
    used for configuring the qdiscs.

    Example:
    -----<cut
    S 01 500000
    S 02 500000
    qdisc replace dev enp3s0 parent root handle 100 taprio \
    num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 \
    base-time 1536883100000000000 \
    sched-entry S 01 300000 \
    sched-entry S 02 300000 \
    sched-entry S 04 400000 \
    clockid CLOCK_TAI

    qdisc replace dev enp3s0 parent 100:1 etf \
    offload delta 300000 clockid CLOCK_TAI

    qdisc replace dev enp3s0 parent 100:2 etf clockid CLOCK_TAI \
    delta 300000 offload deadline_mode
    ----->end

    <FILTER FILE> allows different traffic classes to be identified in a
    155 changes: 87 additions & 68 deletions README.taprio
    Original file line number Diff line number Diff line change
    @@ -46,70 +46,48 @@ CLOCK_TAI is the reference clockid used throughout the example for the
    qdiscs and the applications.


    # LISTENER #
    1) Setup network
    sudo ip addr add 192.168.0.78/24 broadcast 192.168.0.255 dev enp3s0


    2) Start time sync (ptp master)
    sudo ./setup_clock_sync.sh -i enp3s0 -m -v


    3) Start iperf server
    iperf3 -s


    4) Prepare 'dump-classifier' files. Please refer to
    README.classifier for further information.

    --filters--
    talker_strict :: udp port 7788
    talker_deadline :: udp port 7798

    --gates.sched--
    S 0x1 300000
    S 0x2 300000
    S 0x4 400000


    5) Start capturing traffic:

    sudo tcpdump -c 600000 -i enp3s0 -w tmp.pcap -j adapter_unsynced \
    -tt --time-stamp-precision=nano

    6) Use the talkers to transmit packets as described on the next section.

    # NOTE ON VLAN USAGE #

    7) After traffic was captured, check if packets arrived outside of their
    time-slices. The base-time comes from the udp_tai for TC 0 minus the
    250us txtime offset as used below. For example:
    If your tests require that VLAN tagging is performed by the end stations, then
    you must configure the kernel to do so. There are different ways to approach that,
    one of them is to create a vlan interface that knows how to map from a socket
    priority to the VLAN PCP.

    ./dump-classifier -b 1528320726000000000 -d tmp.pcap -f filter \
    -s gates.sched | grep -v ontime
    e.g.: $ ip link add link enp2s0 name enp2s0.2 type vlan id 2 egress 2:2 3:3
    $ ip link set dev enp2s0.2 up

    This maps socket priority 2 to PCP 2 and 3 to 3 for egress on a VLAN with id 2.
    The same can be done for ingress.


    # TALKER #
    1) Setup network
    sudo ip addr add 192.168.0.77/24 broadcast 192.168.0.255 dev enp3s0
    sudo ip addr add 192.168.0.77/24 broadcast 192.168.0.255 dev enp3s0

    2) Setup qdiscs

    2.0) Prepare sched file for taprio
    --gates.sched--
    S 0x1 300000
    S 0x2 300000
    S 0x4 400000
    The script 'config-taprio.sh' configures taprio and ETF
    automatically, using the parameters explained below. It also
    saves the configuration it used to the 'taprio.batch' file, so
    the same file can be reused for analysis.

    The rest of Section 2 describes taprio and etf configuration
    parameters briefly.

    2.1) Setup taprio with a base-time starting in 2min from now rounded down.
    We must add the 37s UTC-TAI offset to the timestamp we get with 'date'.

    i=$((`date +%s%N` + 37000000000 + (2 * 60 * 1000000000))) ; \
    base=$(($i - ($i % 1000000000))) ; \
    tc qdisc add dev enp3s0 parent root handle 100 taprio num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 \
    sched-file gates.sched base-time $base clockid CLOCK_TAI
    i=$((`date +%s%N` + 37000000000 + (2 * 60 * 1000000000))) ; \
    base=$(($i - ($i % 1000000000))) ; \
    tc qdisc replace dev enp3s0 parent root handle 100 taprio \
    num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 \
    base-time $base \
    sched-entry S 01 300000 \
    sched-entry S 02 300000 \
    sched-entry S 04 400000 \
    clockid CLOCK_TAI

    We can read the above as:
    - there are 3 traffic classes (num_tc 3);
    @@ -123,54 +101,95 @@ qdiscs and the applications.

    2.2) Setup etf for queue TC 0:

    tc qdisc replace dev enp3s0 parent 100:1 etf clockid CLOCK_TAI \
    delta 200000 offload
    tc qdisc replace dev enp3s0 parent 100:1 etf clockid CLOCK_TAI \
    delta 200000 offload


    2.3) Setup etf in deadline mode for TC 1:

    tc qdisc replace dev enp3s0 parent 100:2 etf clockid CLOCK_TAI \
    delta 200000 offload deadline_mode
    tc qdisc replace dev enp3s0 parent 100:2 etf clockid CLOCK_TAI \
    delta 200000 offload deadline_mode


    3) Start time sync (ptp slave):
    sudo ./setup_clock_sync.sh -i enp3s0 -s -v
    sudo ./setup_clock_sync.sh -i enp3s0 -s -v


    4) Start iperf3 client:
    iperf3 -c 192.168.0.78 -t 600 --fq-rate 100M
    iperf3 -c 192.168.0.78 -t 600 --fq-rate 100M


    5) Start udp_tai for TC 0. Use a base-time starting in 1min from now + a
    250us offset for txtime:

    now=`date +%s%N` ; i=$(($now + 37000000000 + (60 * 1000000000))) ; \
    base=$(($i - ($i % 1000000000) + 250000)) ; \
    sudo ./udp_tai -i enp3s0 -b $base -P 1000000 -t 3 -p 90 -d 600000 \
    now=`date +%s%N` ; i=$(($now + 37000000000 + (60 * 1000000000))) ; \
    base=$(($i - ($i % 1000000000) + 250000)) ; \
    sudo ./udp_tai -i enp3s0 -b $base -P 1000000 -t 3 -p 90 -d 600000 \
    -u 7788

    To automate this process a little, the script
    'run-udp-tai-tc0.sh' is provided.

    6) Start udp_tai in deadline mode for TC 1. Use the txtime computed for
    the previous traffic class (above) and add 300us so it falls under the
    second time slice (TC 1). For example, if the instance of udp_tai executed
    on the previous step printed
    "txtime of 1st packet is: 1528320726000250000", then we should now run:

    sudo ./udp_tai -i enp3s0 -t 2 -p 90 -D -d 600000 \
    sudo ./udp_tai -i enp3s0 -t 2 -p 90 -D -d 600000 \
    -b 1528320726000550000 -u 7798

    # LISTENER #
    1) Setup network
    sudo ip addr add 192.168.0.78/24 broadcast 192.168.0.255 dev enp3s0


    # NOTE ON VLAN USAGE #
    2) Start time sync (ptp master)
    sudo ./setup_clock_sync.sh -i enp3s0 -m -v

    If your tests require that VLAN tagging is performed by the end stations, then
    you must configure the kernel to do so. There are different ways to approach that,
    one of them is to create a vlan interface that knows how to map from a socket
    priority to the VLAN PCP.

    e.g.: $ ip link add link enp2s0 name enp2s0.2 type vlan id 2 egress 2:2 3:3
    $ ip link set dev enp2s0.2 up
    3) Start iperf server
    iperf3 -s

    This maps socket priority 2 to PCP 2 and 3 to 3 for egress on a VLAN with id 2.
    The same can be done for ingress.

    4) Prepare 'dump-classifier' files. Running 'config-taprio.sh'
    should produce a 'taprio.batch' file, which is used to verify
    how well the schedule specified there was followed.
    Please refer to README.classifier for further information.

    --filters--
    talker_strict :: udp port 7788
    talker_deadline :: udp port 7798


    --taprio.batch--
    qdisc replace dev enp3s0 parent root handle 100 taprio \
    num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 \
    base-time 1536883100000000000 \
    sched-entry S 01 300000 \
    sched-entry S 02 300000 \
    sched-entry S 04 400000 \
    clockid CLOCK_TAI

    qdisc replace dev enp3s0 parent 100:1 etf \
    offload delta 300000 clockid CLOCK_TAI

    qdisc replace dev enp3s0 parent 100:2 etf clockid CLOCK_TAI \
    delta 300000 offload deadline_mode

    5) Start capturing traffic:

    sudo tcpdump -c 600000 -i enp3s0 -w tmp.pcap -j adapter_unsynced \
    -tt --time-stamp-precision=nano

    6) Use the talkers to transmit packets as described in the TALKER section above.


    7) After traffic was captured, check if packets arrived outside of their
    time-slices. The base-time comes from the udp_tai for TC 0 minus the
    250us txtime offset as used below. For example:

    ./dump-classifier -b 1528320726000000000 -d tmp.pcap -f filter \
    -s taprio.batch | grep -v ontime
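
The base-time arithmetic from step 2.1 of the TALKER section (current UTC time, plus the 37 s UTC-TAI offset, plus a lead time, rounded down to a whole second) can also be written in C. This is only a sketch: `taprio_base_time` is a hypothetical helper, and the 37 s constant is taken from the README and assumed not to have changed.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC       1000000000LL
#define UTC_TAI_OFFSET_SEC 37LL /* value used in the README; an assumption */

/* Mirror of the shell arithmetic:
 *   i=$((now + 37000000000 + lead)); base=$((i - (i % 1000000000)))
 * utc_now_ns: current UTC time in nanoseconds (what `date +%s%N` prints)
 * lead_sec:   how far in the future the schedule should start
 */
static int64_t taprio_base_time(int64_t utc_now_ns, int64_t lead_sec)
{
	int64_t t;

	assert(utc_now_ns >= 0 && lead_sec >= 0);
	t = utc_now_ns + (UTC_TAI_OFFSET_SEC + lead_sec) * NSEC_PER_SEC;
	return t - (t % NSEC_PER_SEC); /* round down to a whole second */
}
```

A program running on the talker could instead read `CLOCK_TAI` with `clock_gettime()` and skip the manual offset, provided the kernel's TAI offset has been set correctly (which this setup does not guarantee by itself).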
    275 changes: 238 additions & 37 deletions dump-classifier.c
    Original file line number Diff line number Diff line change
    @@ -18,6 +18,23 @@

    #define NUM_FILTERS 8
    #define NUM_ENTRIES 64
    #define MAX_ARGS 100

    /* TAPRIO */
    enum {
    TC_TAPRIO_CMD_SET_GATES = 0x00,
    TC_TAPRIO_CMD_SET_AND_HOLD = 0x01,
    TC_TAPRIO_CMD_SET_AND_RELEASE = 0x02,
    };

    #define NEXT_ARG() \
    do { \
    argv++; \
    if (--argc <= 0) { \
    fprintf(stderr, "Incomplete command\n"); \
    exit(-1); \
    } \
    } while (0)

    enum traffic_flags {
    TRAFFIC_FLAGS_TXTIME,
    @@ -37,38 +54,36 @@ struct sched_entry {

    struct schedule {
    struct sched_entry entries[NUM_ENTRIES];
    uint64_t base_time;
    int64_t base_time;
    size_t current_entry;
    size_t num_entries;
    uint64_t cycle_time;
    int64_t cycle_time;
    };

    static struct argp_option options[] = {
    {"sched-file", 's', "SCHED_FILE", 0, "File containing the schedule" },
    {"batch-file", 's', "BATCH_FILE", 0, "File containing the taprio configuration" },
    {"dump-file", 'd', "DUMP_FILE", 0, "File containing the tcpdump dump" },
    {"filters-file", 'f', "FILTERS_FILE", 0, "File containing the classification filters" },
    {"base-time", 'b', "TIME", 0, "Timestamp indicating when the schedule starts" },
    { 0 }
    };

    static struct tc_filter traffic_filters[NUM_FILTERS];
    static FILE *sched_file, *dump_file, *filters_file;
    static FILE *batch_file, *dump_file, *filters_file;
    static struct schedule schedule;
    static uint64_t base_time;

    static error_t parser(int key, char *arg, struct argp_state *state)
    {
    switch (key) {
    case 'd':
    dump_file = fopen(arg, "r");
    if (!dump_file) {
    case 's':
    batch_file = fopen(arg, "r");
    if (!batch_file) {
    perror("Could not open file, fopen");
    exit(EXIT_FAILURE);
    }
    break;
    case 's':
    sched_file = fopen(arg, "r");
    if (!sched_file) {
    case 'd':
    dump_file = fopen(arg, "r");
    if (!dump_file) {
    perror("Could not open file, fopen");
    exit(EXIT_FAILURE);
    }
    @@ -80,9 +95,6 @@ static error_t parser(int key, char *arg, struct argp_state *state)
    exit(EXIT_FAILURE);
    }
    break;
    case 'b':
    base_time = strtoull(arg, NULL, 0);
    break;
    }

    return 0;
    @@ -92,36 +104,97 @@ static struct argp argp = { options, parser };

    static void usage(void)
    {
    fprintf(stderr, "dump-classifier -s <sched-file> -d <dump-file> -f <filters-file> -b <base-time>\n");
    fprintf(stderr, "dump-classifier -s <tc batch file> -d <dump-file> -f <filters-file>\n");
    }

    static int parse_schedule(FILE *file, struct schedule *schedule,
    size_t max_entries, uint64_t base_time)
    /* split command line into argument vector */
    int makeargs(char *line, char *argv[], int max_args)
    {
    uint32_t interval, gatemask;
    size_t i = 0;

    while (fscanf(file, "%*s %x %" PRIu32 "\n",
    &gatemask, &interval) != EOF) {
    struct sched_entry *entry;
    static const char ws[] = " \t\r\n";
    char *cp = line;
    int argc = 0;

    if (i >= max_entries)
    return -EINVAL;
    while (*cp) {
    /* skip leading whitespace */
    cp += strspn(cp, ws);

    entry = &schedule->entries[i];
    if (*cp == '\0')
    break;

    entry->gatemask = gatemask;
    entry->interval = interval;
    if (argc >= (max_args - 1))
    return -1;

    /* word begins with quote */
    if (*cp == '\'' || *cp == '"') {
    char quote = *cp++;

    argv[argc++] = cp;
    /* find ending quote */
    cp = strchr(cp, quote);
    if (cp == NULL) {
    fprintf(stderr, "Unterminated quoted string\n");
    exit(1);
    }
    } else {
    argv[argc++] = cp;

    /* find end of word */
    cp += strcspn(cp, ws);
    if (*cp == '\0')
    break;
    }

    i++;
    /* separate words */
    *cp++ = 0;
    }
    argv[argc] = NULL;

    schedule->base_time = base_time;
    schedule->current_entry = 0;
    schedule->num_entries = i;
    return argc;
    }

    return i;
    /* Like glibc getline but handle continuation lines and comments */
    ssize_t getcmdline(char **linep, size_t *lenp, FILE *in)
    {
    ssize_t cc;
    char *cp;

    cc = getline(linep, lenp, in);
    if (cc < 0)
    return cc; /* eof or error */

    cp = strchr(*linep, '#');
    if (cp)
    *cp = '\0';

    while ((cp = strstr(*linep, "\\\n")) != NULL) {
    char *line1 = NULL;
    size_t len1 = 0;
    ssize_t cc1;

    cc1 = getline(&line1, &len1, in);
    if (cc1 < 0) {
    fprintf(stderr, "Missing continuation line\n");
    return cc1;
    }

    *cp = 0;

    cp = strchr(line1, '#');
    if (cp)
    *cp = '\0';

    *lenp = strlen(*linep) + strlen(line1) + 1;
    *linep = realloc(*linep, *lenp);
    if (!*linep) {
    fprintf(stderr, "Out of memory\n");
    *lenp = 0;
    return -1;
    }
    cc += cc1 - 2;
    strcat(*linep, line1);
    free(line1);
    }
    return cc;
    }

    static int parse_filters(pcap_t *handle, FILE *file,
    @@ -150,6 +223,134 @@ static int parse_filters(pcap_t *handle, FILE *file,
    return i;
    }

    static int str_to_entry_cmd(const char *str)
    {
    if (strcmp(str, "S") == 0)
    return TC_TAPRIO_CMD_SET_GATES;

    if (strcmp(str, "H") == 0)
    return TC_TAPRIO_CMD_SET_AND_HOLD;

    if (strcmp(str, "R") == 0)
    return TC_TAPRIO_CMD_SET_AND_RELEASE;

    return -1;
    }

    int get_u32(uint32_t *val, const char *arg, int base)
    {
    unsigned long res;
    char *ptr;

    if (!arg || !*arg)
    return -1;
    res = strtoul(arg, &ptr, base);

    /* empty string or trailing non-digits */
    if (!ptr || ptr == arg || *ptr)
    return -1;

    /* overflow */
    if (res == ULONG_MAX && errno == ERANGE)
    return -1;

    /* in case UL > 32 bits */
    if (res > 0xFFFFFFFFUL)
    return -1;

    *val = res;
    return 0;
    }

    int get_s64(int64_t *val, const char *arg, int base)
    {
    long long res;
    char *ptr;

    errno = 0;

    if (!arg || !*arg)
    return -1;
    res = strtoll(arg, &ptr, base);
    if (!ptr || ptr == arg || *ptr)
    return -1;
    if ((res == LLONG_MIN || res == LLONG_MAX) && errno == ERANGE)
    return -1;
    if (res > INT64_MAX || res < INT64_MIN)
    return -1;

    *val = res;
    return 0;
    }

    static int parse_batch_file(FILE *file, struct schedule *schedule, size_t max_entries)
    {
    int argc;
    char *arguments[MAX_ARGS];
    char **argv;
    size_t len;
    char *line = NULL;
    int err;

    if (getcmdline(&line, &len, file) < 0) {
    fprintf(stderr, "Could not read batch file\n");
    exit(EXIT_FAILURE);
    }

    argc = makeargs(line, arguments, MAX_ARGS);
    if (argc < 0) {
    fprintf(stderr, "Could not parse arguments\n");
    return -1;
    }

    argv = arguments;

    while (argc > 0) {
    if (strcmp(*argv, "sched-entry") == 0) {
    struct sched_entry *e;

    if (schedule->num_entries >= max_entries) {
    fprintf(stderr, "The maximum number of schedule entries is %zu\n", max_entries);
    return -1;
    }

    e = &schedule->entries[schedule->num_entries];

    NEXT_ARG();
    err = str_to_entry_cmd(*argv);
    if (err < 0) {
    fprintf(stderr, "Could not parse command (found %s)\n", *argv);
    return -1;
    }
    e->command = err;

    NEXT_ARG();
    if (get_u32(&e->gatemask, *argv, 16)) {
    fprintf(stderr, "Could not parse gatemask (found %s)\n", *argv);
    return -1;
    }

    NEXT_ARG();
    if (get_u32(&e->interval, *argv, 0)) {
    fprintf(stderr, "Could not parse interval (found %s)\n", *argv);
    return -1;
    }

    schedule->num_entries++;

    } else if (strcmp(*argv, "base-time") == 0) {
    NEXT_ARG();
    if (get_s64(&schedule->base_time, *argv, 10)) {
    fprintf(stderr, "Could not parse base-time (found %s)\n", *argv);
    return -1;
    }
    }
    argc--; argv++;
    }

    return 0;
    }

    /* libpcap re-uses the timeval struct for nanosecond resolution when
    * PCAP_TSTAMP_PRECISION_NANO is specified.
    */
    @@ -309,13 +510,13 @@ int main(int argc, char **argv)

    argp_parse(&argp, argc, argv, 0, NULL, NULL);

    if (!dump_file || !sched_file || !filters_file || !base_time) {
    if (!dump_file || !batch_file || !filters_file) {
    usage();
    exit(EXIT_FAILURE);
    }

    err = parse_schedule(sched_file, &schedule, NUM_ENTRIES, base_time);
    if (err <= 0) {
    err = parse_batch_file(batch_file, &schedule, NUM_ENTRIES);
    if (err < 0) {
    fprintf(stderr, "Could not parse schedule file (or file empty)\n");
    exit(EXIT_FAILURE);
    }
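
Once the batch file has been parsed into a base time and a list of entries, checking whether a packet arrived "on time" comes down to finding which schedule entry is active at the packet's timestamp. The sketch below shows one way to do that lookup. It is not the code from dump-classifier.c (which this diff does not show in full); `struct gate_entry` and `active_entry` are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a parsed "sched-entry S <mask> <interval>" */
struct gate_entry {
	uint32_t gatemask;
	uint32_t interval; /* nanoseconds */
};

/* Return the index of the entry active at timestamp ts, or -1 if the
 * schedule is empty or ts lies before base_time.
 */
static int active_entry(const struct gate_entry *e, size_t n,
			int64_t base_time, int64_t ts)
{
	int64_t cycle = 0, offset;
	size_t i;

	assert(e != NULL || n == 0);

	/* The cycle time is the sum of all entry intervals. */
	for (i = 0; i < n; i++)
		cycle += e[i].interval;

	if (cycle == 0 || ts < base_time)
		return -1;

	/* Position of ts inside the current cycle. */
	offset = (ts - base_time) % cycle;

	for (i = 0; i < n; i++) {
		if (offset < e[i].interval)
			return (int)i;
		offset -= e[i].interval;
	}
	return -1; /* not reached when cycle > 0 */
}
```

With the 300/300/400 us schedule from the README, a packet stamped 250 us after base-time maps to entry 0 and one stamped 550 us after it maps to entry 1.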
  5. Jesus Sanchez-Palencia revised this gist Sep 28, 2018. 3 changed files with 93 additions and 0 deletions.
    31 changes: 31 additions & 0 deletions config-etf.sh
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,31 @@
    #!/bin/bash
    #
    # Copyright (c) 2018, Intel Corporation
    #
    # SPDX-License-Identifier: BSD-3-Clause
    #

    IFACE=$1

    if [ -z "$IFACE" ]; then
    echo "You must provide the network interface as first argument"
    exit 1
    fi

    BATCH_FILE=etf.batch

    cat > $BATCH_FILE <<EOF
    qdisc replace dev $IFACE parent root handle 100 mqprio \\
    num_tc 3 \\
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \\
    queues 1@0 1@1 2@2 \\
    hw 0
    qdisc replace dev $IFACE parent 100:1 etf \\
    offload delta 300000 clockid CLOCK_TAI
    qdisc replace dev $IFACE parent 100:2 etf clockid CLOCK_TAI \\
    delta 300000 offload deadline_mode
    EOF

    tc -batch $BATCH_FILE
    41 changes: 41 additions & 0 deletions config-taprio.sh
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,41 @@
    #!/bin/bash
    #
    # Copyright (c) 2018, Intel Corporation
    #
    # SPDX-License-Identifier: BSD-3-Clause
    #

    IFACE=$1

    if [ -z "$IFACE" ]; then
    echo "You must provide the network interface as first argument"
    exit 1
    fi

    i=$((`date +%s%N` + 37000000000 + (2 * 60 * 1000000000)))

    BASE_TIME=$(($i - ($i % 1000000000)))
    BATCH_FILE=taprio.batch

    cat > $BATCH_FILE <<EOF
    qdisc replace dev $IFACE parent root handle 100 taprio \\
    num_tc 3 \\
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \\
    queues 1@0 1@1 2@2 \\
    base-time $BASE_TIME \\
    sched-entry S 01 300000 \\
    sched-entry S 02 300000 \\
    sched-entry S 04 400000 \\
    clockid CLOCK_TAI
    qdisc replace dev $IFACE parent 100:1 etf \\
    offload delta 200000 clockid CLOCK_TAI
    qdisc replace dev $IFACE parent 100:2 etf clockid CLOCK_TAI \\
    delta 200000 offload deadline_mode
    EOF

    tc -batch $BATCH_FILE

    echo "Base time: $BASE_TIME"
    echo "Configuration saved to: $BATCH_FILE"
    21 changes: 21 additions & 0 deletions run-udp-tai-tc0.sh
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,21 @@
    #!/bin/sh
    #
    # Copyright (c) 2018, Intel Corporation
    #
    # SPDX-License-Identifier: BSD-3-Clause
    #

    IFACE=$1

    if [ -z "$IFACE" ]; then
    echo "You must provide the network interface as first argument"
    exit 1
    fi

    # Now plus 1 minute
    PLUS_1MIN=$((`date +%s%N` + 37000000000 + (60 * 1000000000)))

    # Base will be the next "round" timestamp ~1 min from now, plus 250us
    BASE=$(($PLUS_1MIN - ( $PLUS_1MIN % 1000000000 ) + 250000))

    sudo ./udp_tai -i $IFACE -b $BASE -P 1000000 -t 3 -p 90 -d 600000 -u 7788
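
The script starts the talker on a whole second plus 250 us with a period of 1000000 ns, which matches the 1 ms taprio cycle (300 + 300 + 400 us). Each packet should therefore keep the same offset inside the cycle. The sketch below works through that arithmetic, assuming `-P` is the period in nanoseconds and that udp_tai advances the txtime by exactly one period per packet, as the l2_tai.c send loop does. `nth_txtime` and `phase_in_cycle` are hypothetical helpers, not part of these tools.

```c
#include <assert.h>
#include <stdint.h>

/* txtime of the n-th packet (n = 0 is the first one) */
static int64_t nth_txtime(int64_t base_ns, int64_t period_ns, int64_t n)
{
	assert(period_ns > 0 && n >= 0);
	return base_ns + n * period_ns;
}

/* Offset of txtime_ns inside the schedule cycle. The talker base may be
 * earlier than the schedule base-time, so a negative remainder is folded
 * back into [0, cycle_ns).
 */
static int64_t phase_in_cycle(int64_t txtime_ns, int64_t sched_base_ns,
			      int64_t cycle_ns)
{
	int64_t rem;

	assert(cycle_ns > 0);
	rem = (txtime_ns - sched_base_ns) % cycle_ns;
	return rem < 0 ? rem + cycle_ns : rem;
}
```

Because both base times are rounded to whole seconds and the cycle divides one second evenly, the phase stays at 250 us even when the taprio base-time lies a minute after the talker's base.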
  6. Jesus Sanchez-Palencia revised this gist Sep 20, 2018. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions README.taprio
    Original file line number Diff line number Diff line change
    @@ -69,7 +69,7 @@ qdiscs and the applications.
    --gates.sched--
    S 0x1 300000
    S 0x2 300000
    S 0xc 400000
    S 0x4 400000


    5) Start capturing traffic:
    @@ -99,7 +99,7 @@ qdiscs and the applications.
    --gates.sched--
    S 0x1 300000
    S 0x2 300000
    S 0xc 400000
    S 0x4 400000


    2.1) Setup taprio with a base-time starting in 2min from now rounded down.
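
The gate mask change above (0xc to 0x4) is easier to see bit by bit: 0xc sets bits 2 and 3, while the setup only defines three traffic classes (TC 0 to TC 2), so 0x4 is the mask that opens exactly TC 2. Here is a small sketch of the decoding, assuming bit i of the mask corresponds to traffic class i. `gate_is_open` and `open_gate_count` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the gate for traffic class tc is open in gatemask. */
static bool gate_is_open(uint32_t gatemask, unsigned int tc)
{
	assert(tc < 32);
	return (gatemask >> tc) & 1u;
}

/* Number of open gates among the first num_tc traffic classes. */
static unsigned int open_gate_count(uint32_t gatemask, unsigned int num_tc)
{
	unsigned int tc, count = 0;

	assert(num_tc <= 32);
	for (tc = 0; tc < num_tc; tc++)
		if (gate_is_open(gatemask, tc))
			count++;
	return count;
}
```

For num_tc 3, both 0xc and 0x4 open one existing class (TC 2). The difference is that 0xc also sets a bit for a class that does not exist.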
  7. Jesus Sanchez-Palencia revised this gist Sep 19, 2018. 2 changed files with 29 additions and 0 deletions.
    15 changes: 15 additions & 0 deletions README.etf
    Original file line number Diff line number Diff line change
    @@ -110,3 +110,18 @@ e.g.: $ tshark -r tmp.pcap --disable-protocol dcp-etsi --disable-protocol \

    $ ./txtime_offset_stats.py -f tmp.out



    # NOTE ON VLAN USAGE #

    If your tests require that VLAN tagging is performed by the end stations, then
    you must configure the kernel to do so. There are different ways to approach that,
    one of them is to create a vlan interface that knows how to map from a socket
    priority to the VLAN PCP.

    e.g.: $ ip link add link enp2s0 name enp2s0.2 type vlan id 2 egress 2:2 3:3
    $ ip link set dev enp2s0.2 up

    This maps socket priority 2 to PCP 2 and 3 to 3 for egress on a VLAN with id 2.
    The same can be done for ingress.

    14 changes: 14 additions & 0 deletions README.taprio
    Original file line number Diff line number Diff line change
    @@ -160,3 +160,17 @@ qdiscs and the applications.
    -b 1528320726000550000 -u 7798



    # NOTE ON VLAN USAGE #

    If your tests require that VLAN tagging is performed by the end stations, then
    you must configure the kernel to do so. There are different ways to approach that,
    one of them is to create a vlan interface that knows how to map from a socket
    priority to the VLAN PCP.

    e.g.: $ ip link add link enp2s0 name enp2s0.2 type vlan id 2 egress 2:2 3:3
    $ ip link set dev enp2s0.2 up

    This maps socket priority 2 to PCP 2 and 3 to 3 for egress on a VLAN with id 2.
    The same can be done for ingress.
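
To make the `egress 2:2 3:3` mapping concrete, the sketch below models how a socket priority could end up as the PCP bits of the 802.1Q tag. The TCI layout (PCP in bits 15-13, DEI in bit 12, VLAN id in bits 11-0) comes from the standard. The table lookup and the names `egress_pcp` and `vlan_tci` are illustrative assumptions, not the kernel's implementation, and unmapped priorities are assumed to fall back to PCP 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct egress_map_entry {
	uint32_t prio; /* socket priority (SO_PRIORITY) */
	uint8_t pcp;   /* VLAN priority code point, 0..7 */
};

/* Equivalent of "egress 2:2 3:3" */
static const struct egress_map_entry egress_map[] = {
	{ 2, 2 },
	{ 3, 3 },
};

/* Look up the PCP for a socket priority; unmapped priorities get 0. */
static uint8_t egress_pcp(uint32_t prio)
{
	size_t i;

	for (i = 0; i < sizeof(egress_map) / sizeof(egress_map[0]); i++)
		if (egress_map[i].prio == prio)
			return egress_map[i].pcp;
	return 0;
}

/* Build the 16-bit Tag Control Information field with DEI = 0. */
static uint16_t vlan_tci(uint8_t pcp, uint16_t vid)
{
	assert(pcp <= 7 && vid <= 0x0fff);
	return (uint16_t)((pcp << 13) | vid);
}
```

With this table, a frame sent with socket priority 3 on VLAN 2 would carry TCI 0x6002.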

  8. Jesus Sanchez-Palencia revised this gist Sep 18, 2018. 1 changed file with 477 additions and 0 deletions.
    477 changes: 477 additions & 0 deletions l2_tai.c
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,477 @@
    /*
    * This program demonstrates transmission of L2 frames using the
    * system TAI timer.
    *
    * Copyright (c) 2018, Intel Corporation
    *
    * Copyright (C) 2017 linutronix GmbH
    *
    * Large portions taken from the linuxptp stack.
    * Copyright (C) 2011, 2012 Richard Cochran <richardcochran@gmail.com>
    *
    * Some portions taken from the sgd test program.
    * Copyright (C) 2015 linutronix GmbH
    *
    * This program is free software; you can redistribute it and/or modify
    * it under the terms of the GNU General Public License as published by
    * the Free Software Foundation; either version 2 of the License, or
    * (at your option) any later version.
    *
    * This program is distributed in the hope that it will be useful,
    * but WITHOUT ANY WARRANTY; without even the implied warranty of
    * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    * GNU General Public License for more details.
    *
    * You should have received a copy of the GNU General Public License along
    * with this program; if not, write to the Free Software Foundation, Inc.,
    * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
    */
    #define _GNU_SOURCE /*for CPU_SET*/
    #include <errno.h>
    #include <ifaddrs.h>
    #include <linux/errqueue.h>
    #include <linux/if_ether.h>
    #include <linux/if_packet.h>
    #include <net/if.h>
    #include <netinet/in.h>
    #include <poll.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    #define ONE_SEC 1000000000ULL
    #define DEFAULT_PERIOD 1000000
    #define DEFAULT_DELAY 500000
    #define DEFAULT_PRIORITY 3
    #define MARKER 'a'

    #ifndef SO_TXTIME
    #define SO_TXTIME 61
    #define SCM_TXTIME SO_TXTIME
    #endif

    #ifndef SO_EE_ORIGIN_TXTIME
    #define SO_EE_ORIGIN_TXTIME 6
    #define SO_EE_CODE_TXTIME_INVALID_PARAM 1
    #define SO_EE_CODE_TXTIME_MISSED 2
    #endif

    #define pr_err(s) fprintf(stderr, s "\n")
    #define pr_info(s) fprintf(stdout, s "\n")

    /* The API for SO_TXTIME is the below struct and enum, which will be
    * provided by uapi/linux/net_tstamp.h in the near future.
    */
    struct sock_txtime {
    clockid_t clockid;
    uint16_t flags;
    };

    enum txtime_flags {
    SOF_TXTIME_DEADLINE_MODE = (1 << 0),
    SOF_TXTIME_REPORT_ERRORS = (1 << 1),

    SOF_TXTIME_FLAGS_LAST = SOF_TXTIME_REPORT_ERRORS,
    SOF_TXTIME_FLAGS_MASK = (SOF_TXTIME_FLAGS_LAST - 1) |
    SOF_TXTIME_FLAGS_LAST
    };


    static int running = 1, use_so_txtime = 1;
    static int period_nsec = DEFAULT_PERIOD;
    static int waketx_delay = DEFAULT_DELAY;
    static int so_priority = DEFAULT_PRIORITY;
    static int use_deadline_mode = 0;
    static int receive_errors = 0;
    static uint64_t base_time = 0;
    static uint8_t mac_addr[ETH_ALEN] = {0};
    static struct sock_txtime sk_txtime;
    static struct sockaddr_ll addr = {0};

    static void normalize(struct timespec *ts)
    {
    while (ts->tv_nsec > 999999999) {
    ts->tv_sec += 1;
    ts->tv_nsec -= ONE_SEC;
    }

    while (ts->tv_nsec < 0) {
    ts->tv_sec -= 1;
    ts->tv_nsec += ONE_SEC;
    }
    }

    static int sk_interface_index(int fd, const char *name)
    {
    struct ifreq ifreq;
    int err;

    memset(&ifreq, 0, sizeof(ifreq));
    strncpy(ifreq.ifr_name, name, sizeof(ifreq.ifr_name) - 1);
    err = ioctl(fd, SIOCGIFINDEX, &ifreq);
    if (err < 0) {
    pr_err("ioctl SIOCGIFINDEX failed: %m");
    return err;
    }
    return ifreq.ifr_ifindex;
    }

    static int l2_open_socket(const char *name, clockid_t clkid)
    {
    int fd, index, on = 1;
    addr.sll_family = AF_PACKET;
    addr.sll_protocol = htons(ETH_P_TSN);
    addr.sll_halen = ETH_ALEN;

    fd = socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_TSN));
    if (fd < 0) {
    pr_err("socket failed: %m");
    goto no_socket;
    }

    index = sk_interface_index(fd, name);
    if (index < 0)
    goto no_option;

    addr.sll_ifindex = index;

    if (setsockopt(fd, SOL_SOCKET, SO_PRIORITY, &so_priority, sizeof(so_priority))) {
    pr_err("Couldn't set priority");
    goto no_option;
    }

    memcpy(&addr.sll_addr, mac_addr, ETH_ALEN);

    sk_txtime.clockid = clkid;
    sk_txtime.flags = (use_deadline_mode | receive_errors);
    if (use_so_txtime && setsockopt(fd, SOL_SOCKET, SO_TXTIME, &sk_txtime, sizeof(sk_txtime))) {
    pr_err("setsockopt SO_TXTIME failed: %m");
    goto no_option;
    }

    return fd;
    no_option:
    close(fd);
    no_socket:
    return -1;
    }

    static int l2_send(int fd, void *buf, int len, __u64 txtime)
    {
    char control[CMSG_SPACE(sizeof(txtime))] = {};
    struct cmsghdr *cmsg;
    struct msghdr msg;
    struct iovec iov;
    ssize_t cnt;

    iov.iov_base = buf;
    iov.iov_len = len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &addr;
    msg.msg_namelen = sizeof(addr);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /*
    * We specify the transmission time in the CMSG.
    */
    if (use_so_txtime) {
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_TXTIME;
    cmsg->cmsg_len = CMSG_LEN(sizeof(__u64));
    *((__u64 *) CMSG_DATA(cmsg)) = txtime;
    }

    cnt = sendmsg(fd, &msg, 0);
    if (cnt < 1) {
    pr_err("sendmsg failed: %m");
    return cnt;
    }

    return cnt;
    }

    static unsigned char tx_buffer[256];

    static int process_socket_error_queue(int fd)
    {
    uint8_t msg_control[CMSG_SPACE(sizeof(struct sock_extended_err))];
    unsigned char err_buffer[sizeof(tx_buffer)];
    struct sock_extended_err *serr;
    struct cmsghdr *cmsg;
    __u64 tstamp = 0;

    struct iovec iov = {
    .iov_base = err_buffer,
    .iov_len = sizeof(err_buffer)
    };
    struct msghdr msg = {
    .msg_iov = &iov,
    .msg_iovlen = 1,
    .msg_control = msg_control,
    .msg_controllen = sizeof(msg_control)
    };

    if (recvmsg(fd, &msg, MSG_ERRQUEUE) == -1) {
    pr_err("recvmsg failed");
    return -1;
    }

    cmsg = CMSG_FIRSTHDR(&msg);
    while (cmsg != NULL) {
    serr = (void *) CMSG_DATA(cmsg);
    if (serr->ee_origin == SO_EE_ORIGIN_TXTIME) {
    tstamp = ((__u64) serr->ee_data << 32) + serr->ee_info;

    switch(serr->ee_code) {
    case SO_EE_CODE_TXTIME_INVALID_PARAM:
    fprintf(stderr, "packet with tstamp %llu dropped due to invalid params\n", tstamp);
    return 0;
    case SO_EE_CODE_TXTIME_MISSED:
    fprintf(stderr, "packet with tstamp %llu dropped due to missed deadline\n", tstamp);
    return 0;
    default:
    return -1;
    }
    }

    cmsg = CMSG_NXTHDR(&msg, cmsg);
    }

    return 0;
    }

    static int run_nanosleep(clockid_t clkid, int fd)
    {
    struct timespec ts;
    int cnt, err;
    __u64 txtime;
    struct pollfd p_fd = {
    .fd = fd,
    };

    memset(tx_buffer, MARKER, sizeof(tx_buffer));

    /* If no base-time was specified, start one to two seconds in the
    * future.
    */
    if (base_time == 0) {
    clock_gettime(clkid, &ts);
    ts.tv_sec += 1;
    ts.tv_nsec = ONE_SEC - waketx_delay;
    } else {
    ts.tv_sec = base_time / ONE_SEC;
    ts.tv_nsec = (base_time % ONE_SEC) - waketx_delay;
    }

    normalize(&ts);

    txtime = ts.tv_sec * ONE_SEC + ts.tv_nsec;
    txtime += waketx_delay;

    fprintf(stderr, "\ntxtime of 1st packet is: %llu", txtime);

    while (running) {
    memcpy(tx_buffer, &txtime, sizeof(__u64));
    err = clock_nanosleep(clkid, TIMER_ABSTIME, &ts, NULL);
    switch (err) {
    case 0:
    cnt = l2_send(fd, tx_buffer, sizeof(tx_buffer), txtime);
    if (cnt != sizeof(tx_buffer)) {
    pr_err("send failed");
    }
    ts.tv_nsec += period_nsec;
    normalize(&ts);
    txtime += period_nsec;

    /* Check if errors are pending on the error queue. */
    err = poll(&p_fd, 1, 0);
    if (err == 1 && p_fd.revents & POLLERR) {
    if (!process_socket_error_queue(fd))
    return -ECANCELED;
    }

    break;
    case EINTR:
    continue;
    default:
    fprintf(stderr, "clock_nanosleep returned %d: %s\n",
    err, strerror(err));
    return err;
    }
    }

    return 0;
    }

    static int set_realtime(pthread_t thread, int priority, int cpu)
    {
    cpu_set_t cpuset;
    struct sched_param sp;
    int err, policy;

    int min = sched_get_priority_min(SCHED_FIFO);
    int max = sched_get_priority_max(SCHED_FIFO);

    fprintf(stderr, "min %d max %d\n", min, max);

    if (priority < 0) {
    return 0;
    }

    err = pthread_getschedparam(thread, &policy, &sp);
    if (err) {
    fprintf(stderr, "pthread_getschedparam: %s\n", strerror(err));
    return -1;
    }

    sp.sched_priority = priority;

    err = pthread_setschedparam(thread, SCHED_FIFO, &sp);
    if (err) {
    fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
    return -1;
    }

    if (cpu < 0) {
    return 0;
    }
    CPU_ZERO(&cpuset);
    CPU_SET(cpu, &cpuset);
    err = pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset);
    if (err) {
    fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
    return -1;
    }

    return 0;
    }

    static void usage(char *progname)
    {
    fprintf(stderr,
    "\n"
    "usage: %s [options]\n"
    "\n"
    " -c [num] run on CPU 'num'\n"
    " -d [num] delta from wake up to txtime in nanoseconds (default %d)\n"
    " -h prints this message and exits\n"
    " -i [name] use network interface 'name'\n"
    " -p [num] run with RT priority 'num'\n"
    " -P [num] period in nanoseconds (default %d)\n"
    " -s do not use SO_TXTIME\n"
    " -t [num] set SO_PRIORITY to 'num' (default %d)\n"
    " -D set deadline mode for SO_TXTIME\n"
    " -E enable error reporting on the socket error queue for SO_TXTIME\n"
    " -b [tstamp] txtime of 1st packet as a 64bit [tstamp]. Default: now + ~2seconds\n"
    " -m [mac_addr] dst MAC address\n"
    "\n",
    progname, DEFAULT_DELAY, DEFAULT_PERIOD, DEFAULT_PRIORITY);
    }

    int main(int argc, char *argv[])
    {
    int c, cpu = -1, err, fd, priority = -1;
    clockid_t clkid = CLOCK_TAI;
    char *iface = NULL, *progname;

    /* Process the command line arguments. */
    progname = strrchr(argv[0], '/');
    progname = progname ? 1 + progname : argv[0];
    while (EOF != (c = getopt(argc, argv, "c:d:hi:p:P:st:DEb:m:"))) {
    switch (c) {
    case 'c':
    cpu = atoi(optarg);
    break;
    case 'd':
    waketx_delay = atoi(optarg);
    break;
    case 'h':
    usage(progname);
    return 0;
    case 'i':
    iface = optarg;
    break;
    case 'p':
    priority = atoi(optarg);
    break;
    case 'P':
    period_nsec = atoi(optarg);
    break;
    case 's':
    use_so_txtime = 0;
    break;
    case 't':
    so_priority = atoi(optarg);
    break;
    case 'D':
    use_deadline_mode = SOF_TXTIME_DEADLINE_MODE;
    break;
    case 'E':
    receive_errors = SOF_TXTIME_REPORT_ERRORS;
    break;
    case 'b':
    base_time = atoll(optarg);
    break;
    case 'm':
    err = sscanf(optarg, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
    &mac_addr[0], &mac_addr[1], &mac_addr[2],
    &mac_addr[3], &mac_addr[4], &mac_addr[5]);
    if (err != 6) {
    printf("Invalid MAC address\n");
    return -1;
    }

    break;
    case '?':
    usage(progname);
    return -1;
    }
    }

    if (mac_addr[0] == 0 && mac_addr[1] == 0 && mac_addr[2] == 0) {
    pr_err("Destination MAC Address must be specified.");
    usage(progname);
    return -1;
    }

    if (waketx_delay > 999999999 || waketx_delay < 0) {
    pr_err("Bad wake up to transmission delay.");
    usage(progname);
    return -1;
    }

    if (period_nsec < 1000) {
    pr_err("Bad period.");
    usage(progname);
    return -1;
    }

    if (!iface) {
    pr_err("Need a network interface.");
    usage(progname);
    return -1;
    }

    if (set_realtime(pthread_self(), priority, cpu)) {
    return -1;
    }

    fd = l2_open_socket(iface, clkid);
    if (fd < 0) {
    return -1;
    }

    err = run_nanosleep(clkid, fd);

    close(fd);
    return err;
    }
  9. Jesus Sanchez-Palencia revised this gist Sep 18, 2018. 2 changed files with 11 additions and 11 deletions.
    4 changes: 2 additions & 2 deletions README.taprio
    @@ -108,12 +108,12 @@ qdiscs and the applications.
    i=$((`date +%s%N` + 37000000000 + (2 * 60 * 1000000000))) ; \
    base=$(($i - ($i % 1000000000))) ; \
    tc qdisc add dev enp3s0 parent root handle 100 taprio num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 0 1 2 2 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 \
    sched-file gates.sched base-time $base clockid CLOCK_TAI

    We can read the above as:
    - there are 3 traffic classes (num_tc 3);
    - SO_PRIORITY value 3 maps to TC 0, while value 2 maps to TC 1.
    - SO_PRIORITY value 3 maps to TC 0, while value 2 maps to TC 1.
    Everything else maps to the other (best-effort) traffic classes;
    - "queues 0 1 2 2" is a positional argument, meaning that TC 0 maps
    to queue 0, TC 1 maps to queue 1 and TC 2 maps to queues 2 and 3.
    18 changes: 9 additions & 9 deletions txtime_offset_stats.py
    @@ -59,15 +59,15 @@ def compute_offsets_stats(file_path):
    total_sqr_dist += delta * new_delta

    if count != 0.0:
    variance = total_sqr_dist / (count - 1)
    std_dev = math.sqrt(variance)

    print("min:\t\t%e" % min_t)
    print("max:\t\t%e" % max_t)
    print("jitter (pk-pk):\t%e" % (max_t - min_t))
    print("avg:\t\t%e" % mean)
    print("std dev:\t%e" % std_dev)
    print("count:\t\t%d" % count)
    variance = total_sqr_dist / (count - 1)
    std_dev = math.sqrt(variance)

    print("min:\t\t%e" % min_t)
    print("max:\t\t%e" % max_t)
    print("jitter (pk-pk):\t%e" % (max_t - min_t))
    print("avg:\t\t%e" % mean)
    print("std dev:\t%e" % std_dev)
    print("count:\t\t%d" % count)


    def main():
  10. Jesus Sanchez-Palencia revised this gist Jul 3, 2018. 1 changed file with 20 additions and 7 deletions.
    27 changes: 20 additions & 7 deletions udp_tai.c
    @@ -61,20 +61,33 @@
    #define SCM_TXTIME SO_TXTIME
    #endif

    #ifndef SO_EE_CODE_TXTIME_INVALID_PARAM
    #define SO_EE_CODE_TXTIME_INVALID_PARAM 2
    #define SO_EE_CODE_TXTIME_MISSED 3
    #ifndef SO_EE_ORIGIN_TXTIME
    #define SO_EE_ORIGIN_TXTIME 6
    #define SO_EE_CODE_TXTIME_INVALID_PARAM 1
    #define SO_EE_CODE_TXTIME_MISSED 2
    #endif

    #define pr_err(s) fprintf(stderr, s "\n")
    #define pr_info(s) fprintf(stdout, s "\n")

    /* The parameter for SO_TXTIME is the below struct. */
    /* The API for SO_TXTIME is the below struct and enum, which will be
    * provided by uapi/linux/net_tstamp.h in the near future.
    */
    struct sock_txtime {
    clockid_t clockid;
    uint16_t flags;
    };

    enum txtime_flags {
    SOF_TXTIME_DEADLINE_MODE = (1 << 0),
    SOF_TXTIME_REPORT_ERRORS = (1 << 1),

    SOF_TXTIME_FLAGS_LAST = SOF_TXTIME_REPORT_ERRORS,
    SOF_TXTIME_FLAGS_MASK = (SOF_TXTIME_FLAGS_LAST - 1) |
    SOF_TXTIME_FLAGS_LAST
    };


    static int running = 1, use_so_txtime = 1;
    static int period_nsec = DEFAULT_PERIOD;
    static int waketx_delay = DEFAULT_DELAY;
    @@ -294,7 +307,7 @@ static int process_socket_error_queue(int fd)
    cmsg = CMSG_FIRSTHDR(&msg);
    while (cmsg != NULL) {
    serr = (void *) CMSG_DATA(cmsg);
    if (serr->ee_origin == SO_EE_ORIGIN_LOCAL) {
    if (serr->ee_origin == SO_EE_ORIGIN_TXTIME) {
    tstamp = ((__u64) serr->ee_data << 32) + serr->ee_info;

    switch(serr->ee_code) {
    @@ -479,10 +492,10 @@ int main(int argc, char *argv[])
    so_priority = atoi(optarg);
    break;
    case 'D':
    use_deadline_mode = (1 << 0);
    use_deadline_mode = SOF_TXTIME_DEADLINE_MODE;
    break;
    case 'E':
    receive_errors = (1 << 1);
    receive_errors = SOF_TXTIME_REPORT_ERRORS;
    break;
    case 'b':
    base_time = atoll(optarg);
  11. Jesus Sanchez-Palencia revised this gist Jun 26, 2018. 1 changed file with 0 additions and 0 deletions.
    Empty file added Scheduled Tx Tools
    Empty file.
  12. Jesus Sanchez-Palencia revised this gist Jun 26, 2018. 4 changed files with 19 additions and 19 deletions.
    8 changes: 4 additions & 4 deletions README
    @@ -1,5 +1,5 @@
    Here we provide a testing application and scripts that can be used
    to exercise the SO_TXTIME APIs, the tbs qdisc and the taprio qdisc.
    to exercise the SO_TXTIME APIs, the etf qdisc and the taprio qdisc.

    The example is based on a sample application (udp_tai.c) provided by
    Richard Cochran as part of the RFC v1 of SO_TXTIME. We've extended
    @@ -9,12 +9,12 @@ scheduler, and a combination of those.

    The documentation is split into 2 README files:

    - README.tbs: Provides instructions for how to setup an example to
    use tbs standalone. In other words, only Time-based
    - README.etf: Provides instructions for how to setup an example to
    use etf standalone. In other words, only Time-based
    transmission is used.

    - README.taprio: Provides instructions for how to setup an example
    to use tbs and taprio together. That means using
    to use etf and taprio together. That means using
    a Time-aware scheduler (i.e. 802.1Qbv) in conjunction with
    time-based transmission for fine-grained control over
    the Tx time of packets.
    8 changes: 4 additions & 4 deletions README.tbs → README.etf
    @@ -1,5 +1,5 @@
    Here we present the steps taken for setting up a test that uses *only*
    the TBS qdisc. That means that only Time-based transmission is exercised.
    the ETF qdisc. That means that only Time-based transmission is exercised.

    The 'talker' side of the example described below will transmit a packet
    every 1ms. The packet's txtime is set through the SO_TXTIME api, and is
    @@ -52,9 +52,9 @@ Our DUT uses an Intel i210 NIC, and our setup here is as follows.
    num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 hw 0

    1.b) Then we setup tbs with the desired config:
    e.g.: $ sudo tc qdisc add dev enp2s0 parent 8001:1 tbs \
    offload clockid CLOCK_TAI sorting delta 150000
    1.b) Then we setup etf with the desired config:
    e.g.: $ sudo tc qdisc add dev enp2s0 parent 8001:1 etf \
    offload clockid CLOCK_TAI delta 150000



    16 changes: 8 additions & 8 deletions README.taprio
    @@ -1,5 +1,5 @@
    Here we present the steps taken for setting up a test that uses both
    the TBS qdisc and the TAPRIO one. That means that we'll use a (Qbv-like)
    the ETF qdisc and the TAPRIO one. That means that we'll use a (Qbv-like)
    port scheduler with a fixed Tx schedule for traffic classes (TC), while
    using Time-based transmission for controlling the Tx time of packets within
    each TC.
    @@ -20,7 +20,7 @@ time-slice.

    The application enqueueing packets for TC 1 will set its packets *deadline*
    with an offset of 250us within its time-slice. However, because this TC is
    using the deadline mode of SO_TXTIME + tbs, then a packet may be transmitted
    using the deadline mode of SO_TXTIME + etf, then a packet may be transmitted
    at any time within its time slice that is before its deadline.

    Best-effort traffic is transmitted at any time during the third time slice.
    @@ -121,16 +121,16 @@ qdiscs and the applications.
    - the reference clock is CLOCK_TAI;


    2.2) Setup tbs for queue TC 0:
    2.2) Setup etf for queue TC 0:

    tc qdisc replace dev enp3s0 parent 100:1 tbs clockid CLOCK_TAI \
    delta 200000 sorting offload
    tc qdisc replace dev enp3s0 parent 100:1 etf clockid CLOCK_TAI \
    delta 200000 offload


    2.3) Setup tbs in deadline mode for TC 1:
    2.3) Setup etf in deadline mode for TC 1:

    tc qdisc replace dev enp3s0 parent 100:2 tbs clockid CLOCK_TAI \
    delta 200000 sorting offload deadline
    tc qdisc replace dev enp3s0 parent 100:2 etf clockid CLOCK_TAI \
    delta 200000 offload deadline_mode


    3) Start time sync (ptp slave):
    6 changes: 3 additions & 3 deletions setup_clock_sync.sh
    @@ -42,13 +42,13 @@ fi
    # is also running one end of the TSN application (either the listener or the
    # talker), which requires the local clocks to be synchronized.
    #
    # When that isn't the case (i.e. the tbs experiment, in which all we care
    # When that isn't the case (i.e. the etf experiment, in which all we care
    # about is the network clock sync), then just start this script with -m
    # instead so phc2sys is not used and the jitter of the network clock sync is
    # not affected.
    #
    setup_ptp_master() {
    ptp4l -i $INTERFACE $PTP4L_VERBOSE &
    ptp4l -2 -i $INTERFACE $PTP4L_VERBOSE &
    }

    setup_ptp_master_and_sync() {
    @@ -61,7 +61,7 @@ setup_ptp_master_and_sync() {
    # then synchronize the system clock to the PHC.
    setup_ptp_slave() {
    phc2sys -a -r $PHC2SYS_VERBOSE &
    ptp4l -s -i $INTERFACE $PTP4L_VERBOSE &
    ptp4l -2 -s -i $INTERFACE $PTP4L_VERBOSE &
    }


  13. @jeez jeez created this gist Jun 8, 2018.
    15 changes: 15 additions & 0 deletions Makefile
    @@ -0,0 +1,15 @@
    PCAP_CFLAGS=$(shell pcap-config --cflags --libs)

    all: dump-classifier udp_tai

    dump-classifier: dump-classifier.c
    ${CC} ${CFLAGS} $(PCAP_CFLAGS) -o $@ $<

    udp_tai: udp_tai.c
    ${CC} ${CFLAGS} -lpthread -o $@ $<

    clean:
    @rm dump-classifier
    @rm udp_tai

    .PHONY: clean debug
    24 changes: 24 additions & 0 deletions README
    @@ -0,0 +1,24 @@
    Here we provide a testing application and scripts that can be used
    to exercise the SO_TXTIME APIs, the tbs qdisc and the taprio qdisc.

    The example is based on a sample application (udp_tai.c) provided by
    Richard Cochran as part of the RFC v1 of SO_TXTIME. We've extended
    it in several ways so it may be used as an example of different
    setups: systems using only per-packet Tx time, systems using a per-port
    Time-aware scheduler, and systems combining both.

    The documentation is split into 2 README files:

    - README.tbs: Provides instructions on how to set up an example that
    uses tbs standalone. In other words, only Time-based
    transmission is used.

    - README.taprio: Provides instructions on how to set up an example
    that uses tbs and taprio together. That means using
    a Time-aware scheduler (i.e. 802.1Qbv) in conjunction with
    time-based transmission for fine-grained control over
    the Tx time of packets.

    A custom tool known as 'dump-classifier' was developed so we can
    verify if a taprio schedule is being respected. For more information
    please check README.classifier.
    63 changes: 63 additions & 0 deletions README.classifier
    @@ -0,0 +1,63 @@
    To help analyze taprio scheduling characteristics, we've developed a custom
    tool called 'dump-classifier'.


    dump-classifier
    ===============

    dump-classifier aims to ease the test/verification of how well an
    implementation runs 802.1Qbv-like schedules.


    How to compile
    --------------

    * Dependencies:

    - libpcap-dev


    Just running 'make' should work, if all the dependencies are met:

    $ make


    How to run
    ----------

    $ ./dump-classifier -s <SCHED FILE> -f <FILTER FILE> -b <BASE TIME> -d <DUMP FILE>

    <SCHED FILE> is a text file containing the traffic schedule. The format
    is exactly the one that taprio (the qdisc) accepts.

    Example:
    -----<cut
    S 01 500000
    S 02 500000
    ----->end

    <FILTER FILE> allows different traffic classes to be identified in a
    pcap dump file. Each line contains a traffic class name and a pcap
    expression, separated by "::". Any traffic class that doesn't have an
    associated filter is classified as "BE" (best effort). The order is
    important: the first line matches the first traffic class (bit 0) in
    the gatemask parameter (the second field of each line of the schedule
    file), the second line matches the second traffic class (bit 1), and
    so on.

    Example:
    -----<cut
    talker :: ether dst aa:aa:aa:aa:aa:aa
    ----->end

    <BASE TIME> is the absolute time, in nanoseconds, at which the schedule
    started. If that time is before the timestamp of the first packet in
    the <DUMP FILE>, the schedule will run until it reaches that
    timestamp. Packets that have a timestamp before the base time are
    ignored.

    <DUMP FILE> is a dump file captured via tcpdump with nanosecond
    timestamp precision, using something like this:

    $ tcpdump -j adapter_unsynced --time-stamp-precision=nanos -i enp2s0 -w dump.pcap
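
The way a base time and a cyclic schedule determine which gates are open for a given packet timestamp can be sketched as below. This is only an illustration of the idea described above, not the code used by dump-classifier: the `gate_entry` type and the `gatemask_at()` helper are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative entry type; the field names are assumptions for this sketch. */
struct gate_entry {
    uint32_t gatemask;  /* bit N set means traffic class N may transmit */
    uint32_t interval;  /* duration of the entry, in nanoseconds */
};

/*
 * Find the gate mask in effect at packet timestamp 'ts'.
 * Returns 0 and leaves *mask untouched if the packet predates 'base_time'
 * (such packets are ignored), 1 otherwise.
 */
static int gatemask_at(const struct gate_entry *entries, size_t n,
                       uint64_t base_time, uint64_t ts, uint32_t *mask)
{
    uint64_t cycle = 0, pos;
    size_t i;

    assert(entries != NULL && n > 0 && mask != NULL);

    if (ts < base_time)
        return 0;

    /* The cycle time is the sum of all entry intervals. */
    for (i = 0; i < n; i++)
        cycle += entries[i].interval;
    assert(cycle > 0);

    /* Position of the packet inside the repeating cycle. */
    pos = (ts - base_time) % cycle;

    for (i = 0; i < n; i++) {
        if (pos < entries[i].interval) {
            *mask = entries[i].gatemask;
            return 1;
        }
        pos -= entries[i].interval;
    }

    return 0; /* not reached when cycle > 0 */
}
```

With the two-entry example schedule shown earlier (S 01 500000, S 02 500000), a packet arriving 600000 ns after the base time would fall in the second entry, so only the traffic class on bit 1 would be expected at that point.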

    162 changes: 162 additions & 0 deletions README.taprio
    @@ -0,0 +1,162 @@
    Here we present the steps taken for setting up a test that uses both
    the TBS qdisc and the TAPRIO one. That means that we'll use a (Qbv-like)
    port scheduler with a fixed Tx schedule for traffic classes (TC), while
    using Time-based transmission for controlling the Tx time of packets within
    each TC.

    The 'talker' side of the example described below will have 2 applications
    transmitting time-sensitive traffic following a strict cyclic schedule.
    In addition to that, iperf3 is used to transmit best-effort traffic on the
    port. The port schedule is thus composed of 3 time-slices, with a total
    cycle-time of 1 millisecond allocated as:

    - Traffic Class 0 (TC 0): duration of 300us, 'strict txtime' is used.
    - Traffic Class 1 (TC 1): duration of 300us, 'deadline txtime' is used.
    - Traffic Class 2 (TC 2): duration of 400us, best-effort traffic.

    The system is configured so the application enqueueing packets for
    TC 0 will set its packets' *Tx time* with an offset of 250us within its
    time-slice.

    The application enqueueing packets for TC 1 will set its packets' *deadline*
    with an offset of 250us within its time-slice. However, because this TC is
    using the deadline mode of SO_TXTIME + tbs, a packet may be transmitted
    at any time within its time-slice that is before its deadline.

    Best-effort traffic is transmitted at any time during the third time-slice.

    A way to visualize this cycle and its time-slices is:

    |______x_|......D_|bbbbbbbbbb|
    0      299      599        999us


    The application for each time-sensitive traffic class will transmit a packet
    every 1ms. The packet's txtime is set through the SO_TXTIME api, and is
    copied into the packet's payload.

    At the 'listener' side, we capture traffic and then post-process it to
    verify if packets are arriving outside of the time-slice they belong to.

    ptp4l is used for synchronizing the PHC clocks over the network and
    phc2sys is used on the 'talker' side for synchronizing the system
    clock to the PHC.

    CLOCK_TAI is the reference clockid used throughout the example for the
    qdiscs and the applications.


    # LISTENER #
    1) Setup network
    sudo ip addr add 192.168.0.78/4 broadcast 192.168.0.255 dev enp3s0


    2) Start time sync (ptp master)
    sudo ./setup_clock_sync.sh -i enp3s0 -m -v


    3) Start iperf server
    iperf3 -s


    4) Prepare 'dump-classifier' files. Please refer to
    README.classifier for further information.

    --filters--
    talker_strict :: udp port 7788
    talker_deadline :: udp port 7798

    --gates.sched--
    S 0x1 300000
    S 0x2 300000
    S 0xc 400000


    5) Start capturing traffic:

    sudo tcpdump -c 600000 -i enp3s0 -w tmp.pcap -j adapter_unsynced \
    -tt --time-stamp-precision=nano

    6) Use the talkers to transmit packets as described on the next section.


    7) After traffic was captured, check if packets arrived outside of their
    time-slices. The base-time comes from the udp_tai for TC 0 minus the
    250us txtime offset as used below. For example:

    ./dump-classifier -b 1528320726000000000 -d tmp.pcap -f filter \
    -s gates.sched | grep -v ontime



    # TALKER #
    1) Setup network
    sudo ip addr add 192.168.0.77/4 broadcast 192.168.0.255 dev enp3s0

    2) Setup qdiscs

    2.0) Prepare sched file for taprio
    --gates.sched--
    S 0x1 300000
    S 0x2 300000
    S 0xc 400000


    2.1) Setup taprio with a base-time starting in 2min from now rounded down.
    We must add the 37s UTC-TAI offset to the timestamp we get with 'date'.

    i=$((`date +%s%N` + 37000000000 + (2 * 60 * 1000000000))) ; \
    base=$(($i - ($i % 1000000000))) ; \
    tc qdisc add dev enp3s0 parent root handle 100 taprio num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 0 1 2 2 \
    sched-file gates.sched base-time $base clockid CLOCK_TAI

    We can read the above as:
    - there are 3 traffic classes (num_tc 3);
    - SO_PRIORITY value 3 maps to TC 0, while value 2 maps to TC 1.
    Everything else maps to the other (best-effort) traffic classes;
    - "queues 0 1 2 2" is a positional argument, meaning that TC 0 maps
    to queue 0, TC 1 maps to queue 1 and TC 2 maps to queues 2 and 3.
    - gates.sched is used as schedule file;
    - the reference clock is CLOCK_TAI;
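
The reading of the "map" and "queues" arguments above can be expressed as two small lookup tables. The sketch below is illustrative only: `tc_for_priority()` and `queues_for_tc()` are hypothetical helpers, and masking the priority to its low 4 bits is an assumption made here so that any value selects one of the 16 map slots.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PRIO   16
#define NUM_QUEUES 4

/* "map 2 2 1 0 2 2 ...": the index is the priority, the value is the TC. */
static const uint8_t prio_to_tc[NUM_PRIO] = {
    2, 2, 1, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2
};

/* "queues 0 1 2 2": the index is the Tx queue, the value is the TC. */
static const uint8_t queue_to_tc[NUM_QUEUES] = { 0, 1, 2, 2 };

static unsigned int tc_for_priority(unsigned int prio)
{
    /* Assumption: only the low 4 bits of the priority select a map slot. */
    return prio_to_tc[prio & (NUM_PRIO - 1)];
}

/* Returns a bitmask of the Tx queues that serve traffic class 'tc'. */
static unsigned int queues_for_tc(unsigned int tc)
{
    unsigned int q, mask = 0;

    assert(tc < 3); /* num_tc 3 */

    for (q = 0; q < NUM_QUEUES; q++)
        if (queue_to_tc[q] == tc)
            mask |= 1u << q;

    return mask;
}
```

Here tc_for_priority(3) gives TC 0 and tc_for_priority(2) gives TC 1, as stated in the list above. queues_for_tc(2) yields 0xc (queues 2 and 3), which is the same value used in the third line of gates.sched.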


    2.2) Setup tbs for queue TC 0:

    tc qdisc replace dev enp3s0 parent 100:1 tbs clockid CLOCK_TAI \
    delta 200000 sorting offload


    2.3) Setup tbs in deadline mode for TC 1:

    tc qdisc replace dev enp3s0 parent 100:2 tbs clockid CLOCK_TAI \
    delta 200000 sorting offload deadline


    3) Start time sync (ptp slave):
    sudo ./setup_clock_sync.sh -i enp3s0 -s -v


    4) Start iperf3 client:
    iperf3 -c 192.168.0.78 -t 600 --fq-rate 100M


    5) Start udp_tai for TC 0. Use a base-time starting in 1min from now + a
    250us offset for txtime:

    now=`date +%s%N` ; i=$(($now + 37000000000 + (60 * 1000000000))) ; \
    base=$(($i - ($i % 1000000000) + 250000)) ; \
    sudo ./udp_tai -i enp3s0 -b $base -P 1000000 -t 3 -p 90 -d 600000 \
    -u 7788
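
For reference, the shell arithmetic used for the base-time can be written as a small C function. This is a sketch under stated assumptions: `first_txtime()` is a hypothetical helper that is not part of udp_tai.c, and the UTC-TAI offset is passed in as a parameter (37 seconds in this README) rather than queried from the system.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * utc_now_ns:   current time in the UTC scale, in nanoseconds
 *               (what `date +%s%N` prints)
 * tai_offset_s: UTC-TAI offset in seconds (37 in this README)
 * lead_s:       how far in the future the first packet should go out
 * offset_ns:    txtime offset within the second (250000 for TC 0)
 */
static uint64_t first_txtime(uint64_t utc_now_ns, uint64_t tai_offset_s,
                             uint64_t lead_s, uint64_t offset_ns)
{
    uint64_t t = utc_now_ns + (tai_offset_s + lead_s) * NSEC_PER_SEC;

    assert(offset_ns < NSEC_PER_SEC);

    /* Round down to a whole second, then add the offset. */
    return t - (t % NSEC_PER_SEC) + offset_ns;
}
```

Rounding down to a whole second keeps the txtime aligned with the taprio base-time computed in step 2.1, which is rounded the same way, so the 250us offset lands inside the TC 0 time-slice of every 1ms cycle.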


    6) Start udp_tai in deadline mode for TC 1. Use the txtime computed for
    the previous traffic class (above) and add 300us so it falls under the
    second time slice (TC 1). For example, if the instance of udp_tai executed
    on the previous step printed
    "txtime of 1st packet is: 1528320726000250000", then we should now run:

    sudo ./udp_tai -i enp3s0 -t 2 -p 90 -D -d 600000 \
    -b 1528320726000550000 -u 7798
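
The 300us adjustment above is simply the duration of the time-slices that precede TC 1 in the cycle. A minimal sketch of that calculation follows; `txtime_in_slice()` is a hypothetical helper introduced for illustration, and the slice durations are the ones from gates.sched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * cycle_start: start of a schedule cycle, in nanoseconds
 * slice_ns:    duration of each time-slice, in schedule order
 * slice:       index of the time-slice the packet belongs to
 * offset_ns:   offset of the txtime (or deadline) within that slice
 */
static uint64_t txtime_in_slice(uint64_t cycle_start, const uint64_t *slice_ns,
                                size_t n_slices, size_t slice,
                                uint64_t offset_ns)
{
    uint64_t t = cycle_start;
    size_t i;

    assert(slice_ns != NULL && slice < n_slices);
    assert(offset_ns < slice_ns[slice]);

    /* Skip over the slices that come before the requested one. */
    for (i = 0; i < slice; i++)
        t += slice_ns[i];

    return t + offset_ns;
}
```

With slices of 300000, 300000 and 400000 ns and a cycle starting at 1528320726000000000, slice 0 with a 250000 ns offset gives 1528320726000250000, and slice 1 with the same offset gives 1528320726000550000, matching the values used in the commands above.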


    112 changes: 112 additions & 0 deletions README.tbs
    @@ -0,0 +1,112 @@
    Here we present the steps taken for setting up a test that uses *only*
    the TBS qdisc. That means that only Time-based transmission is exercised.

    The 'talker' side of the example described below will transmit a packet
    every 1ms. The packet's txtime is set through the SO_TXTIME api, and is
    copied into the packet's payload.

    At the 'listener' side, we capture traffic and then post-process it to
    compute the delta between each packet's arrival time and their txtime.

    ptp4l is used for synchronizing the PHC clocks over the network and
    phc2sys is used on the 'talker' side for synchronizing the system
    clock to the PHC.

    CLOCK_TAI is the reference clockid used throughout the example for the
    qdiscs and the applications.


    # LISTENER #

    1) Set up the PTP master. If using the listener end point as PTP
    master, setup_clock_sync.sh can be used as shown below.

    e.g.: $ sudo ip addr add 192.168.0.78/4 broadcast 192.168.0.255 dev IFACE
    $ sudo ./setup_clock_sync.sh -i IFACE -m -v

    This script will start ptp4l so the PHC time is propagated to the
    network. The system clock and the PHC are NOT synchronized in that mode.

    * Note that the TAI offset is applied, so CLOCK_REALTIME will be in
    the UTC scale while CLOCK_TAI will be in the TAI scale, just like
    the PHC.



    2) Start capturing traffic on the listener end point. If we want to capture
    traffic for 1 minute, and are expecting 1 packet per millisecond:

    e.g.: $ sudo tcpdump -c 60000 -i enp3s0 -w tmp.pcap \
    -j adapter_unsynced -tt --time-stamp-precision=nano \
    udp port 7788



    # TALKER #

    3) Configure the Qdiscs on the talker side (Device Under Testing, DUT).
    Our DUT uses an Intel i210 NIC, and our setup here is as follows.

    3.a) First, we set up mqprio as the root qdisc:
    e.g.: $ sudo tc qdisc replace dev IFACE parent root mqprio \
    num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 hw 0

    3.b) Then we set up tbs with the desired config:
    e.g.: $ sudo tc qdisc add dev enp2s0 parent 8001:1 tbs \
    offload clockid CLOCK_TAI sorting delta 150000



    4) Setup the Device Under Testing (DUT) as PTP slave and synchronize
    the local clocks.

    e.g.: $ sudo ip addr add 192.168.0.77/4 broadcast 192.168.0.255 dev IFACE
    $ sudo ./setup_clock_sync.sh -i IFACE -s -v

    This script will start ptp4l so the PHC is synchronized to the PTP master,
    and then will synchronize the system clock to PHC using phc2sys.
    At this stage, based purely on empirical observations, one recommendation
    is waiting for the rms value reported by ptp4l to reach a value below 15 ns,
    and to remain somewhat constant after that.

    * Note that the TAI offset is applied, so CLOCK_REALTIME will be in the UTC
    scale while CLOCK_TAI will be in the TAI scale, just like the PHC.



    5) Optionally, build and run check_clocks on both PTP master and slave.

    e.g.: $ make check_clocks && sudo ./check_clocks IFACE

    It reports the timestamps fetched from CLOCK_REALTIME, CLOCK_TAI and
    the interface's PHC, as well as the latency for reading from each clock
    and the delta between the PHC and the system clocks.
    You may use this information to verify if the offsets were applied
    correctly and if the PHC - CLOCK_TAI delta is not too high. Again,
    based on empirical observations, we consider this value as "good enough"
    if it's less than 25us, and it's been observed to get as low as 4us.



    6) Build and run udp_tai on the talker end station

    e.g.: $ gcc -o udp_tai -lpthread udp_tai.c
    $ sudo ./udp_tai -i enp2s0 -P 1000000 -p 90 -d 600000




    # LISTENER #

    7) Analyze traffic and generate statistics.
    We first use tshark for post-processing the pcap file as needed, then
    we use a custom python script to compute the packets' offset from their
    expected arrival time, and then compute statistics for the overall data set.

    e.g.: $ tshark -r tmp.pcap --disable-protocol dcp-etsi --disable-protocol \
    dcp-pft -t e -E separator=, -T fields -e frame.number \
    -e frame.time_epoch -e data.data > tmp.out

    $ ./txtime_offset_stats.py -f tmp.out
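
The kind of statistics mentioned in step 7 (min, max, mean and variance of the per-packet offsets) can be accumulated in a single pass. The sketch below uses the well-known online mean/variance update, often attributed to Welford. It is a C illustration of the general technique, not a translation of txtime_offset_stats.py, and `struct offset_stats`, `stats_add()` and `stats_variance()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Must be zero-initialized before the first call to stats_add(). */
struct offset_stats {
    size_t count;
    double min, max, mean;
    double m2; /* running sum of squared distances from the mean */
};

/* Fold one offset sample (arrival time minus txtime) into the stats. */
static void stats_add(struct offset_stats *s, double x)
{
    double delta;

    assert(s != NULL);

    if (s->count == 0 || x < s->min)
        s->min = x;
    if (s->count == 0 || x > s->max)
        s->max = x;

    s->count++;
    delta = x - s->mean;
    s->mean += delta / (double) s->count;
    s->m2 += delta * (x - s->mean);
}

/* Sample variance; defined as 0 when fewer than two samples were seen. */
static double stats_variance(const struct offset_stats *s)
{
    assert(s != NULL);
    return s->count > 1 ? s->m2 / (double) (s->count - 1) : 0.0;
}
```

The peak-to-peak jitter is then max minus min, and the standard deviation is the square root of the value returned by stats_variance(). Guarding the division for fewer than two samples avoids dividing by zero when the capture contains a single packet.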

    122 changes: 122 additions & 0 deletions check_clocks.c
    @@ -0,0 +1,122 @@
    /*
    * Copyright (c) 2018, Intel Corporation
    *
    * SPDX-License-Identifier: BSD-3-Clause
    *
    */

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>
    #include <linux/ethtool.h>
    #include <linux/sockios.h>
    #include <net/if.h>
    #include <sys/ioctl.h>

    #define ONE_SEC 1000000000ULL
    #define PTP_MAX_DEV_PATH 16

    /* fd to clockid helpers. Copied from posix-timers.h. */
    #define CLOCKFD 3
    static inline clockid_t make_process_cpuclock(const unsigned int pid,
    const clockid_t clock)
    {
    return ((~pid) << 3) | clock;
    }

    static inline clockid_t fd_to_clockid(const int fd)
    {
    return make_process_cpuclock((unsigned int) fd, CLOCKFD);
    }

    static inline void open_phc_fd(int* fd_ptp, char* ifname)
    {
    struct ethtool_ts_info interface_info = {0};
    char ptp_path[PTP_MAX_DEV_PATH];
    struct ifreq req = {0};
    int fd_ioctl;

    /* Get PHC index */
    interface_info.cmd = ETHTOOL_GET_TS_INFO;
    snprintf(req.ifr_name, sizeof(req.ifr_name), "%s", ifname);
    req.ifr_data = (char *) &interface_info;

    fd_ioctl = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd_ioctl < 0) {
    perror("Couldn't open socket");
    exit(EXIT_FAILURE);
    }

    if (ioctl(fd_ioctl, SIOCETHTOOL, &req) < 0) {
    perror("Couldn't issue SIOCETHTOOL ioctl");
    exit(EXIT_FAILURE);
    }

    snprintf(ptp_path, sizeof(ptp_path), "%s%d", "/dev/ptp",
    interface_info.phc_index);

    *fd_ptp = open(ptp_path, O_RDONLY);
    if (*fd_ptp < 0) {
    perror("Couldn't open the PTP fd. Did you forget to run with sudo again?");
    exit(EXIT_FAILURE);
    }

    close(fd_ioctl);
    }

    int main(int argc, char** argv)
    {
    struct timespec ts_rt1, ts_rt2, ts_ptp1, ts_ptp2, ts_tai1, ts_tai2;
    uint64_t rt, tai, ptp, lat_rt, lat_tai, lat_ptp;
    char ifname[IFNAMSIZ] = {0}; /* zeroed so strncpy() below always leaves a NUL terminator */
    int fd_ptp, err;

    if (argc < 2) {
    printf("You must run this as %s NET_IFACE (e.g. enp2s0)\n", argv[0]);
    return EXIT_FAILURE;
    }

    strncpy(ifname, argv[1], sizeof(ifname) - 1);
    open_phc_fd(&fd_ptp, ifname);

    /* Fetch timestamps for each clock. */
    clock_gettime(CLOCK_REALTIME, &ts_rt1);
    clock_gettime(CLOCK_TAI, &ts_tai1);
    clock_gettime(fd_to_clockid(fd_ptp), &ts_ptp1);
    rt = (ts_rt1.tv_sec * ONE_SEC) + ts_rt1.tv_nsec;
    tai = (ts_tai1.tv_sec * ONE_SEC) + ts_tai1.tv_nsec;
    ptp = (ts_ptp1.tv_sec * ONE_SEC) + ts_ptp1.tv_nsec;

    /* Compute clocks read latency. */
    clock_gettime(CLOCK_REALTIME, &ts_rt1);
    clock_gettime(CLOCK_REALTIME, &ts_rt2);
    lat_rt = ((ts_rt2.tv_sec * ONE_SEC) + ts_rt2.tv_nsec)
    - ((ts_rt1.tv_sec * ONE_SEC) + ts_rt1.tv_nsec);

    clock_gettime(CLOCK_TAI, &ts_tai1);
    clock_gettime(CLOCK_TAI, &ts_tai2);
    lat_tai = ((ts_tai2.tv_sec * ONE_SEC) + ts_tai2.tv_nsec)
    - ((ts_tai1.tv_sec * ONE_SEC) + ts_tai1.tv_nsec);

    clock_gettime(fd_to_clockid(fd_ptp), &ts_ptp1);
    clock_gettime(fd_to_clockid(fd_ptp), &ts_ptp2);
    lat_ptp = ((ts_ptp2.tv_sec * ONE_SEC) + ts_ptp2.tv_nsec)
    - ((ts_ptp1.tv_sec * ONE_SEC) + ts_ptp1.tv_nsec);

    printf("rt tstamp:\t%llu\n", rt);
    printf("tai tstamp:\t%llu\n", tai);
    printf("phc tstamp:\t%llu\n", ptp);
    printf("rt latency:\t%llu\n", lat_rt);
    printf("tai latency:\t%llu\n", lat_tai);
    printf("phc latency:\t%llu\n", lat_ptp);
    printf("phc-rt delta:\t%lld\n", (long long) (ptp - rt));
    printf("phc-tai delta:\t%lld\n", (long long) (ptp - tai));

    close(fd_ptp);

    return EXIT_SUCCESS;
    }
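
The delta arithmetic in check_clocks.c can be sketched in Python as below. This is only an illustration, not part of the gist: `timespec_to_ns`, `signed_delta` and `as_uint64` are hypothetical helper names, and the timestamps are made-up values. It shows why the phc-rt and phc-tai deltas should be treated as signed: when the PHC is behind the other clock, an unsigned 64-bit print wraps around to a huge number.

```python
ONE_SEC = 1_000_000_000  # nanoseconds per second, as in check_clocks.c


def timespec_to_ns(tv_sec, tv_nsec):
    """Flatten a (seconds, nanoseconds) pair into one nanosecond count."""
    return tv_sec * ONE_SEC + tv_nsec


def signed_delta(a_ns, b_ns):
    """Difference a - b in nanoseconds; negative when a is behind b."""
    return a_ns - b_ns


def as_uint64(value):
    """The same value as it would appear if printed as an unsigned 64-bit integer."""
    return value & 0xFFFFFFFFFFFFFFFF


# Made-up sample: the PHC reads exactly 37 s ahead of CLOCK_REALTIME.
rt = timespec_to_ns(1521534608, 456)
phc = timespec_to_ns(1521534645, 456)

print(signed_delta(phc, rt))             # 37000000000
print(signed_delta(rt, phc))             # -37000000000
print(as_uint64(signed_delta(rt, phc)))  # wrapped value an unsigned format would show
```

A delta close to 37 seconds between the PHC and CLOCK_REALTIME usually just means the PHC follows TAI while the system clock follows UTC.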
    348 changes: 348 additions & 0 deletions dump-classifier.c
    @@ -0,0 +1,348 @@
    /*
    * Copyright (c) 2018, Intel Corporation
    *
    * SPDX-License-Identifier: BSD-3-Clause
    *
    */

    #include <argp.h>
    #include <inttypes.h>
    #include <pcap.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

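    /* Integer constant: a double (1e9) cannot represent epoch timestamps
    * with nanosecond precision.
    */
    #define NSEC_TO_SEC 1000000000ULL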

    #define NUM_FILTERS 8
    #define NUM_ENTRIES 64

    enum traffic_flags {
    TRAFFIC_FLAGS_TXTIME,
    };

    struct tc_filter {
    char *name;
    struct bpf_program prog;
    unsigned int flags;
    };

    struct sched_entry {
    uint8_t command;
    uint32_t gatemask;
    uint32_t interval;
    };

    struct schedule {
    struct sched_entry entries[NUM_ENTRIES];
    uint64_t base_time;
    size_t current_entry;
    size_t num_entries;
    uint64_t cycle_time;
    };

    static struct argp_option options[] = {
    {"sched-file", 's', "SCHED_FILE", 0, "File containing the schedule" },
    {"dump-file", 'd', "DUMP_FILE", 0, "File containing the tcpdump dump" },
    {"filters-file", 'f', "FILTERS_FILE", 0, "File containing the classification filters" },
    {"base-time", 'b', "TIME", 0, "Timestamp indicating when the schedule starts" },
    { 0 }
    };

    static struct tc_filter traffic_filters[NUM_FILTERS];
    static FILE *sched_file, *dump_file, *filters_file;
    static struct schedule schedule;
    static uint64_t base_time;

    static error_t parser(int key, char *arg, struct argp_state *state)
    {
    switch (key) {
    case 'd':
    dump_file = fopen(arg, "r");
    if (!dump_file) {
    perror("Could not open file, fopen");
    exit(EXIT_FAILURE);
    }
    break;
    case 's':
    sched_file = fopen(arg, "r");
    if (!sched_file) {
    perror("Could not open file, fopen");
    exit(EXIT_FAILURE);
    }
    break;
    case 'f':
    filters_file = fopen(arg, "r");
    if (!filters_file) {
    perror("Could not open file, fopen");
    exit(EXIT_FAILURE);
    }
    break;
    case 'b':
    base_time = strtoull(arg, NULL, 0);
    break;
    }

    return 0;
    }

    static struct argp argp = { options, parser };

    static void usage(void)
    {
    fprintf(stderr, "dump-classifier -s <sched-file> -d <dump-file> -f <filters-file> -b <base-time>\n");
    }

    static int parse_schedule(FILE *file, struct schedule *schedule,
    size_t max_entries, uint64_t base_time)
    {
    uint32_t interval, gatemask;
    size_t i = 0;

    while (fscanf(file, "%*s %x %" PRIu32 "\n",
    &gatemask, &interval) != EOF) {
    struct sched_entry *entry;

    if (i >= max_entries)
    return -EINVAL;

    entry = &schedule->entries[i];

    entry->gatemask = gatemask;
    entry->interval = interval;

    i++;
    }

    schedule->base_time = base_time;
    schedule->current_entry = 0;
    schedule->num_entries = i;

    return i;

    }

    static int parse_filters(pcap_t *handle, FILE *file,
    struct tc_filter *filters, size_t num_filters)
    {
    char *name, *expression;
    size_t i = 0;
    int err;

    while (i < num_filters && fscanf(file, "%ms :: %m[^\n]s\n",
    &name, &expression) != EOF) {
    struct tc_filter *filter = &filters[i];

    filter->name = name;

    err = pcap_compile(handle, &filter->prog, expression,
    1, PCAP_NETMASK_UNKNOWN);
    if (err < 0) {
    pcap_perror(handle, "pcap_compile");
    return -EINVAL;
    }

    i++;
    }

    return i;
    }

    /* libpcap re-uses the timeval struct for nanosecond resolution when
    * PCAP_TSTAMP_PRECISION_NANO is specified.
    */
    static uint64_t tv_to_nanos(const struct timeval *tv)
    {
    return tv->tv_sec * NSEC_TO_SEC + tv->tv_usec;
    }

    static struct sched_entry *next_entry(struct schedule *schedule)
    {
    schedule->current_entry++;

    if (schedule->current_entry >= schedule->num_entries)
    schedule->current_entry = 0;

    return &schedule->entries[schedule->current_entry];
    }

    static struct sched_entry *first_entry(struct schedule *schedule)
    {
    schedule->current_entry = 0;

    return &schedule->entries[0];
    }

    static struct sched_entry *advance_until(struct schedule *schedule,
    uint64_t ts, uint64_t *now)
    {
    struct sched_entry *first, *entry;
    uint64_t cycle = 0;
    uint64_t n;

    entry = first = first_entry(schedule);

    if (!schedule->cycle_time) {
    do {
    cycle += entry->interval;
    entry = next_entry(schedule);
    } while (entry != first);

    schedule->cycle_time = cycle;
    }

    cycle = schedule->cycle_time;

    n = (ts - schedule->base_time) / cycle;
    *now = schedule->base_time + (n * cycle);

    do {
    if (*now + entry->interval > ts)
    break;

    *now += entry->interval;
    entry = next_entry(schedule);
    } while (true);

    return entry;
    }

    static int match_packet(const struct tc_filter *filters, int num_filters,
    const struct pcap_pkthdr *hdr,
    const uint8_t *frame)
    {
    int err;
    int i;

    for (i = 0; i < num_filters; i++) {
    const struct tc_filter *f = &filters[i];

    err = pcap_offline_filter(&f->prog, hdr, frame);
    if (!err) {
    /* The filter for traffic class 'i' doesn't
    * match the packet
    */
    continue;
    }

    return i;
    }

    /* returning 'num_filters' means that the packet matches none
    * of the filters, so it's a Best Effort packet.
    */
    return num_filters;
    }

    static int classify_frames(pcap_t *handle, const struct tc_filter *tc_filters,
    int num_filters, struct schedule *schedule)
    {
    struct sched_entry *entry;
    struct pcap_pkthdr *hdr;
    const uint8_t *frame;
    uint64_t now, ts;
    int err;

    now = schedule->base_time;

    /* Ignore frames until we get to the base_time of the
    * schedule. */
    do {
    err = pcap_next_ex(handle, &hdr, &frame);
    if (err < 0) {
    pcap_perror(handle, "pcap_next_ex");
    return -EINVAL;
    }

    ts = tv_to_nanos(&hdr->ts);
    } while (ts <= now);

    do {
    const char *name, *ontime;
    int64_t offset;
    int tc;

    ts = tv_to_nanos(&hdr->ts);

    entry = advance_until(schedule, ts, &now);

    tc = match_packet(tc_filters, num_filters, hdr, frame);

    if (tc < num_filters)
    name = tc_filters[tc].name;
    else
    name = "BE";

    if (entry->gatemask & (1 << tc))
    ontime = "ontime";
    else
    ontime = "late";

    offset = ts - now;

    /* XXX: what more information might we need? */
    printf("%" PRIu64 " %" PRIu64 " \"%s\" \"%s\" %" PRId64 " %#x\n",
    ts, now, name, ontime, offset, entry->gatemask);
    } while (pcap_next_ex(handle, &hdr, &frame) >= 0);

    return 0;
    }

    static void free_filters(struct tc_filter *filters, int num_filters)
    {
    int i;

    for (i = 0; i < num_filters; i++) {
    struct tc_filter *f = &filters[i];

    free(f->name);
    }
    }

    int main(int argc, char **argv)
    {
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *handle;
    int err, num;

    argp_parse(&argp, argc, argv, 0, NULL, NULL);

    if (!dump_file || !sched_file || !filters_file || !base_time) {
    usage();
    exit(EXIT_FAILURE);
    }

    err = parse_schedule(sched_file, &schedule, NUM_ENTRIES, base_time);
    if (err <= 0) {
    fprintf(stderr, "Could not parse schedule file (or file empty)\n");
    exit(EXIT_FAILURE);
    }

    handle = pcap_fopen_offline_with_tstamp_precision(
    dump_file, PCAP_TSTAMP_PRECISION_NANO, errbuf);
    if (!handle) {
    fprintf(stderr, "Could not parse dump file\n");
    exit(EXIT_FAILURE);
    }

    num = parse_filters(handle, filters_file,
    traffic_filters, NUM_FILTERS);
    if (num < 0) {
    fprintf(stderr, "Could not parse filters file\n");
    exit(EXIT_FAILURE);
    }

    err = classify_frames(handle, traffic_filters, num, &schedule);
    if (err < 0) {
    fprintf(stderr, "Could not classify frames\n");
    exit(EXIT_FAILURE);
    }

    free_filters(traffic_filters, num);

    pcap_close(handle);

    return 0;
    }
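
The lookup done by advance_until() and the gate check in classify_frames() can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a port of the tool: the function names mirror the C code but the Python versions are hypothetical, the schedule values are made up, and `ts` is assumed to be at or after `base_time`.

```python
def cycle_time(entries):
    """Total length of one schedule cycle; entries are (gatemask, interval_ns)."""
    return sum(interval for _, interval in entries)


def advance_until(entries, base_time, ts):
    """Return (index, start_ns) of the schedule entry that is active at ts."""
    cycle = cycle_time(entries)
    # Skip all the complete cycles that elapsed since base_time.
    n = (ts - base_time) // cycle
    now = base_time + n * cycle
    index = 0
    # Walk the entries of the current cycle until the one containing ts.
    while now + entries[index][1] <= ts:
        now += entries[index][1]
        index = (index + 1) % len(entries)
    return index, now


def on_time(gatemask, tc):
    """A frame is on time when the gate of its traffic class is open."""
    return bool(gatemask & (1 << tc))


# Made-up schedule: three entries, 1 ms cycle.
entries = [(0x1, 300_000), (0x2, 300_000), (0x4, 400_000)]
base_time = 1_000_000_000
ts = base_time + 2 * 1_000_000 + 350_000  # 350 us into the third cycle

index, start = advance_until(entries, base_time, ts)
print((index, start))                 # (1, 1002300000)
print(ts - start)                     # 50000
print(on_time(entries[index][0], 1))  # True
print(on_time(entries[index][0], 0))  # False
```

The offset printed by the tool corresponds to `ts - start` here: how far into the current schedule entry the frame was captured.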
    144 changes: 144 additions & 0 deletions setup_clock_sync.sh
    @@ -0,0 +1,144 @@
    #!/bin/bash
    #
    # Copyright (c) 2018, Intel Corporation
    #
    # SPDX-License-Identifier: BSD-3-Clause
    #

    #
    # DISCLAIMER
    #
    # This script is meant for testing purposes only.
    # It provides an oversimplified approach for having a simple PTP
    # network up and running, with each local node having its CLOCK_TAI
    # offset adjusted.
    #
    # TODO:
    # - find a way to fetch the TAI offset from ptp4l directly. Ivan
    # suggested using pmc for that.
    #

    set -e

    INTERFACE=none
    TAI_OFFSET=37
    PTP4L_VERBOSE=''
    PHC2SYS_VERBOSE=''
    if [ -z "$PTP4L" ]; then
    PTP4L=$(which ptp4l || true)
    fi
    if [ -z "$PHC2SYS" ]; then
    PHC2SYS=$(which phc2sys || true)
    fi


    # On the PTP master, if started with the -M parameter, synchronize the
    # PHC to the system clock first, then propagate that to the network using
    # ptp4l. We trust that the system clock was initially set up correctly or
    # adjusted to some other source (e.g. NTP, GPS, etc).
    #
    # For this -M mode, clocks are kept synchronized by phc2sys.
    # This is provided for the scenarios in which the PTP master on this network
    # is also running one end of the TSN application (either the listener or the
    # talker), which requires the local clocks to be synchronized.
    #
    # When that isn't the case (i.e. the tbs experiment, in which all we care
    # about is the network clock sync), then just start this script with -m
    # instead so phc2sys is not used and the jitter of the network clock sync is
    # not affected.
    #
    setup_ptp_master() {
    "$PTP4L" -i "$INTERFACE" $PTP4L_VERBOSE &
    }

    setup_ptp_master_and_sync() {
    "$PHC2SYS" -c "$INTERFACE" -s CLOCK_REALTIME -w $PHC2SYS_VERBOSE &
    setup_ptp_master
    }


    # On PTP slaves, first synchronize the PHC to the PTP master,
    # then synchronize the system clock to the PHC.
    setup_ptp_slave() {
    "$PHC2SYS" -a -r $PHC2SYS_VERBOSE &
    "$PTP4L" -s -i "$INTERFACE" $PTP4L_VERBOSE &
    }


    # Use adjtimex to set the TAI offset used by CLOCK_TAI.
    adjust_clock_tai_offset() {
    tmp_src=$(mktemp /tmp/XXXXXX.c)
    tmp_bin=$(mktemp)
    cat <<EOF > $tmp_src
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/timex.h>
    int main(void)
    {
    struct timex timex = {
    .modes = ADJ_TAI,
    .constant = $TAI_OFFSET
    };
    if (adjtimex(&timex) == -1) {
    perror("adjtimex failed to set CLOCK_TAI offset");
    return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
    }
    EOF

    gcc -o $tmp_bin $tmp_src
    $tmp_bin
    rm -f $tmp_bin $tmp_src
    }


    test_dependencies() {
    if [ ! -x "$PTP4L" ]; then
    echo "ptp4l must be available from your \$PATH or set \$PTP4L."
    exit -1
    fi
    if [ ! -x "$PHC2SYS" ]; then
    echo "phc2sys must be available from your \$PATH or set \$PHC2SYS."
    exit -1
    fi
    }


    test_dependencies

    ptp_master_mode=f
    while getopts "Mmsvi:" opt; do
    case ${opt} in
    i) INTERFACE=$OPTARG ;;
    m) ptp_master_mode=y ;;
    s) ptp_master_mode=n ;;
    M) ptp_master_mode=M ;;
    v) PTP4L_VERBOSE='-m --summary_interval=5' ;
    PHC2SYS_VERBOSE='-m -u 20' ;;
    *) exit -1 ;;
    esac
    done

    if [ ${INTERFACE} = none ]; then
    echo "You must set the network interface using '-i'."
    exit -1
    fi

    if [ ${ptp_master_mode} = y ]; then
    setup_ptp_master
    adjust_clock_tai_offset
    elif [ ${ptp_master_mode} = M ]; then
    setup_ptp_master_and_sync
    adjust_clock_tai_offset
    elif [ ${ptp_master_mode} = n ]; then
    setup_ptp_slave
    adjust_clock_tai_offset
    else
    echo "You must select PTP master (-m) OR PTP slave (-s) mode."
    exit -1
    fi

    91 changes: 91 additions & 0 deletions txtime_offset_stats.py
    @@ -0,0 +1,91 @@
    #!/usr/bin/env python3
    #
    # Copyright (c) 2018, Intel Corporation
    #
    # SPDX-License-Identifier: BSD-3-Clause
    #

    # Expected input file format is a CSV file with:
    #
    # <FRAME_NUMBER, FRAME_ARRIVAL_TIME, FRAME_PAYLOAD_BYTES>
    # E.g.:
    # 1,1521534608.000000456,00:38:89:bd:a1:93:1d:15:(...)
    # 2,1521534608.001000480,00:38:89:bd:a1:93:1d:15:(...)
    #
    # Frame number: sequence number for each frame
    # Frame arrival time: Rx HW timestamp for each frame
    # Frame Payload: payload starting with 64bit timestamp (txtime)
    #
    # This can be easily generated with tshark with the following command line:
    # $ tshark -r CAPTURE.pcap -t e -E separator=, -T fields -e frame.number \
    # -e frame.time_epoch \
    # -e data.data > DATA.out
    #
    import argparse
    import csv
    import struct
    import math
    import sys

    # TAI to UTC offset in nanoseconds. Currently that is 37 seconds.
    TAI_OFFSET = 37000000000


    def compute_offsets_stats(file_path):
        with open(file_path) as f:
            count = mean = total_sqr_dist = 0.0
            min_t = sys.maxsize
            max_t = -sys.maxsize

            for line in csv.reader(f):
                arrival_tstamp = int(line[1].replace('.', ''))
                data = line[2].split(':')
                txtime = ''.join(data[0:8])
                txtime = bytearray.fromhex(txtime)
                txtime = struct.unpack('<Q', txtime)

                val = float(arrival_tstamp - txtime[0])
                val = (val - TAI_OFFSET) if val > TAI_OFFSET else val

                # Update statistics.
                # Compute the mean and variance online using Welford's algorithm.
                count += 1
                min_t = val if val < min_t else min_t
                max_t = val if val > max_t else max_t

                delta = val - mean
                mean = mean + (delta / count)
                new_delta = val - mean
                total_sqr_dist += delta * new_delta

            # The sample variance needs at least two frames; report 0 otherwise.
            if count > 1:
                variance = total_sqr_dist / (count - 1)
            else:
                variance = 0.0
            std_dev = math.sqrt(variance)

            print("min:\t\t%e" % min_t)
            print("max:\t\t%e" % max_t)
            print("jitter (pk-pk):\t%e" % (max_t - min_t))
            print("avg:\t\t%e" % mean)
            print("std dev:\t%e" % std_dev)
            print("count:\t\t%d" % count)


    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument(
            '-f', dest='file_path', default=None, type=str,
            help='Path to input file (e.g. DATA.out) generated by tshark with:\
            tshark -r CAPTURE.pcap -t e -E separator=, -T\
            fields -e frame.number -e frame.time_epoch\
            -e data.data > DATA.out')

        args = parser.parse_args()

        if args.file_path is not None:
            compute_offsets_stats(args.file_path)
        else:
            parser.print_help()


    if __name__ == "__main__":
        main()
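
How one input line is turned into an offset can be sketched as below. This is a minimal illustration, not the script itself: `arrival_to_ns` and `payload_txtime` are hypothetical helper names, and the sample values are made up. It assumes the arrival time has the `SECONDS.FRACTION` form shown in the header comment and that the payload starts with a little-endian 64-bit txtime.

```python
import struct

ONE_SEC = 1_000_000_000


def arrival_to_ns(epoch_str):
    """Convert 'SECONDS.FRACTION' into integer nanoseconds (pads to 9 digits)."""
    sec, frac = epoch_str.split('.')
    return int(sec) * ONE_SEC + int(frac.ljust(9, '0')[:9])


def payload_txtime(payload_hex):
    """Decode the first 8 colon-separated bytes as a little-endian u64 txtime."""
    raw = bytes.fromhex(''.join(payload_hex.split(':')[:8]))
    return struct.unpack('<Q', raw)[0]


# Build a made-up payload from a known txtime, followed by two marker bytes.
txtime = 1521534608 * ONE_SEC
payload = ':'.join('%02x' % b for b in struct.pack('<Q', txtime) + b'aa')

arrival = arrival_to_ns('1521534608.000000456')
print(payload_txtime(payload) == txtime)  # True
print(arrival - payload_txtime(payload))  # 456
```

Splitting on the decimal point, rather than deleting it, keeps the conversion correct even if the fractional part has fewer than nine digits.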
    530 changes: 530 additions & 0 deletions udp_tai.c
    @@ -0,0 +1,530 @@
    /*
    * This program demonstrates transmission of UDP packets using the
    * system TAI timer.
    *
    * Copyright (C) 2017 linutronix GmbH
    *
    * Large portions taken from the linuxptp stack.
    * Copyright (C) 2011, 2012 Richard Cochran <richardcochran@gmail.com>
    *
    * Some portions taken from the sgd test program.
    * Copyright (C) 2015 linutronix GmbH
    *
    * This program is free software; you can redistribute it and/or modify
    * it under the terms of the GNU General Public License as published by
    * the Free Software Foundation; either version 2 of the License, or
    * (at your option) any later version.
    *
    * This program is distributed in the hope that it will be useful,
    * but WITHOUT ANY WARRANTY; without even the implied warranty of
    * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    * GNU General Public License for more details.
    *
    * You should have received a copy of the GNU General Public License along
    * with this program; if not, write to the Free Software Foundation, Inc.,
    * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
    */
    #define _GNU_SOURCE /*for CPU_SET*/
    #include <arpa/inet.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <ifaddrs.h>
    #include <linux/errqueue.h>
    #include <linux/ethtool.h>
    #include <linux/net_tstamp.h>
    #include <linux/sockios.h>
    #include <net/if.h>
    #include <netinet/in.h>
    #include <poll.h>
    #include <pthread.h>
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define ONE_SEC 1000000000ULL
    #define DEFAULT_PERIOD 1000000
    #define DEFAULT_DELAY 500000
    #define DEFAULT_PRIORITY 3
    #define MCAST_IPADDR "239.1.1.1"
    #define UDP_PORT 7788
    #define MARKER 'a'

    #ifndef SO_TXTIME
    #define SO_TXTIME 61
    #define SCM_TXTIME SO_TXTIME
    #endif

    #ifndef SO_EE_CODE_TXTIME_INVALID_PARAM
    #define SO_EE_CODE_TXTIME_INVALID_PARAM 2
    #define SO_EE_CODE_TXTIME_MISSED 3
    #endif

    #define pr_err(s) fprintf(stderr, s "\n")
    #define pr_info(s) fprintf(stdout, s "\n")

    /* The parameter for SO_TXTIME is the below struct. */
    struct sock_txtime {
    clockid_t clockid;
    uint16_t flags;
    };

    static int running = 1, use_so_txtime = 1;
    static int period_nsec = DEFAULT_PERIOD;
    static int waketx_delay = DEFAULT_DELAY;
    static int so_priority = DEFAULT_PRIORITY;
    static int udp_port = UDP_PORT;
    static int use_deadline_mode = 0;
    static int receive_errors = 0;
    static uint64_t base_time = 0;
    static struct in_addr mcast_addr;
    static struct sock_txtime sk_txtime;

    static int mcast_bind(int fd, int index)
    {
    int err;
    struct ip_mreqn req;
    memset(&req, 0, sizeof(req));
    req.imr_ifindex = index;
    err = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF, &req, sizeof(req));
    if (err) {
    pr_err("setsockopt IP_MULTICAST_IF failed: %m");
    return -1;
    }
    return 0;
    }

    static int mcast_join(int fd, int index, const struct sockaddr *grp,
    socklen_t grplen)
    {
    int err, off = 0;
    struct ip_mreqn req;
    struct sockaddr_in *sa = (struct sockaddr_in *) grp;

    memset(&req, 0, sizeof(req));
    memcpy(&req.imr_multiaddr, &sa->sin_addr, sizeof(struct in_addr));
    req.imr_ifindex = index;
    err = setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &req, sizeof(req));
    if (err) {
    pr_err("setsockopt IP_ADD_MEMBERSHIP failed: %m");
    return -1;
    }
    err = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP, &off, sizeof(off));
    if (err) {
    pr_err("setsockopt IP_MULTICAST_LOOP failed: %m");
    return -1;
    }
    return 0;
    }

    static void normalize(struct timespec *ts)
    {
    while (ts->tv_nsec > 999999999) {
    ts->tv_sec += 1;
    ts->tv_nsec -= ONE_SEC;
    }

    while (ts->tv_nsec < 0) {
    ts->tv_sec -= 1;
    ts->tv_nsec += ONE_SEC;
    }
    }

    static int sk_interface_index(int fd, const char *name)
    {
    struct ifreq ifreq;
    int err;

    memset(&ifreq, 0, sizeof(ifreq));
    strncpy(ifreq.ifr_name, name, sizeof(ifreq.ifr_name) - 1);
    err = ioctl(fd, SIOCGIFINDEX, &ifreq);
    if (err < 0) {
    pr_err("ioctl SIOCGIFINDEX failed: %m");
    return err;
    }
    return ifreq.ifr_ifindex;
    }

    static int open_socket(const char *name, struct in_addr mc_addr, short port, clockid_t clkid)
    {
    struct sockaddr_in addr;
    int fd, index, on = 1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    fd = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (fd < 0) {
    pr_err("socket failed: %m");
    goto no_socket;
    }
    index = sk_interface_index(fd, name);
    if (index < 0)
    goto no_option;

    if (setsockopt(fd, SOL_SOCKET, SO_PRIORITY, &so_priority, sizeof(so_priority))) {
    pr_err("Couldn't set priority");
    goto no_option;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on))) {
    pr_err("setsockopt SO_REUSEADDR failed: %m");
    goto no_option;
    }
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr))) {
    pr_err("bind failed: %m");
    goto no_option;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, name, strlen(name))) {
    pr_err("setsockopt SO_BINDTODEVICE failed: %m");
    goto no_option;
    }
    addr.sin_addr = mc_addr;
    if (mcast_join(fd, index, (struct sockaddr *) &addr, sizeof(addr))) {
    pr_err("mcast_join failed");
    goto no_option;
    }
    if (mcast_bind(fd, index)) {
    goto no_option;
    }

    sk_txtime.clockid = clkid;
    sk_txtime.flags = (use_deadline_mode | receive_errors);
    if (use_so_txtime && setsockopt(fd, SOL_SOCKET, SO_TXTIME, &sk_txtime, sizeof(sk_txtime))) {
    pr_err("setsockopt SO_TXTIME failed: %m");
    goto no_option;
    }

    return fd;
    no_option:
    close(fd);
    no_socket:
    return -1;
    }

    static int udp_open(const char *name, clockid_t clkid)
    {
    int fd;

    if (!inet_aton(MCAST_IPADDR, &mcast_addr))
    return -1;

    fd = open_socket(name, mcast_addr, udp_port, clkid);

    return fd;
    }

    static int udp_send(int fd, void *buf, int len, __u64 txtime)
    {
    char control[CMSG_SPACE(sizeof(txtime))] = {};
    struct sockaddr_in sin;
    struct cmsghdr *cmsg;
    struct msghdr msg;
    struct iovec iov;
    ssize_t cnt;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr = mcast_addr;
    sin.sin_port = htons(udp_port);

    iov.iov_base = buf;
    iov.iov_len = len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &sin;
    msg.msg_namelen = sizeof(sin);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /*
    * We specify the transmission time in the CMSG.
    */
    if (use_so_txtime) {
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_TXTIME;
    cmsg->cmsg_len = CMSG_LEN(sizeof(__u64));
    *((__u64 *) CMSG_DATA(cmsg)) = txtime;
    }
    cnt = sendmsg(fd, &msg, 0);
    if (cnt < 1) {
    pr_err("sendmsg failed: %m");
    return cnt;
    }
    return cnt;
    }

    static unsigned char tx_buffer[256];

    static int process_socket_error_queue(int fd)
    {
    uint8_t msg_control[CMSG_SPACE(sizeof(struct sock_extended_err))];
    unsigned char err_buffer[sizeof(tx_buffer)];
    struct sock_extended_err *serr;
    struct cmsghdr *cmsg;
    __u64 tstamp = 0;

    struct iovec iov = {
    .iov_base = err_buffer,
    .iov_len = sizeof(err_buffer)
    };
    struct msghdr msg = {
    .msg_iov = &iov,
    .msg_iovlen = 1,
    .msg_control = msg_control,
    .msg_controllen = sizeof(msg_control)
    };

    if (recvmsg(fd, &msg, MSG_ERRQUEUE) == -1) {
    pr_err("recvmsg failed");
    return -1;
    }

    cmsg = CMSG_FIRSTHDR(&msg);
    while (cmsg != NULL) {
    serr = (void *) CMSG_DATA(cmsg);
    if (serr->ee_origin == SO_EE_ORIGIN_LOCAL) {
    tstamp = ((__u64) serr->ee_data << 32) + serr->ee_info;

    switch(serr->ee_code) {
    case SO_EE_CODE_TXTIME_INVALID_PARAM:
    fprintf(stderr, "packet with tstamp %llu dropped due to invalid params\n", tstamp);
    return 0;
    case SO_EE_CODE_TXTIME_MISSED:
    fprintf(stderr, "packet with tstamp %llu dropped due to missed deadline\n", tstamp);
    return 0;
    default:
    return -1;
    }
    }

    cmsg = CMSG_NXTHDR(&msg, cmsg);
    }

    return 0;
    }

    static int run_nanosleep(clockid_t clkid, int fd)
    {
    struct timespec ts;
    int cnt, err;
    __u64 txtime;
    struct pollfd p_fd = {
    .fd = fd,
    };

    memset(tx_buffer, MARKER, sizeof(tx_buffer));

    /* If no base-time was specified, start one to two seconds in the
    * future.
    */
    if (base_time == 0) {
    clock_gettime(clkid, &ts);
    ts.tv_sec += 1;
    ts.tv_nsec = ONE_SEC - waketx_delay;
    } else {
    ts.tv_sec = base_time / ONE_SEC;
    ts.tv_nsec = (base_time % ONE_SEC) - waketx_delay;
    }

    normalize(&ts);

    txtime = ts.tv_sec * ONE_SEC + ts.tv_nsec;
    txtime += waketx_delay;

    fprintf(stderr, "\ntxtime of 1st packet is: %llu", txtime);

    while (running) {
    memcpy(tx_buffer, &txtime, sizeof(__u64));
    err = clock_nanosleep(clkid, TIMER_ABSTIME, &ts, NULL);
    switch (err) {
    case 0:
    cnt = udp_send(fd, tx_buffer, sizeof(tx_buffer), txtime);
    if (cnt != sizeof(tx_buffer)) {
    pr_err("udp_send failed");
    }
    ts.tv_nsec += period_nsec;
    normalize(&ts);
    txtime += period_nsec;

    /* Check if errors are pending on the error queue. */
    err = poll(&p_fd, 1, 0);
    if (err == 1 && p_fd.revents & POLLERR) {
    if (!process_socket_error_queue(fd))
    return -ECANCELED;
    }

    break;
    case EINTR:
    continue;
    default:
    fprintf(stderr, "clock_nanosleep returned %d: %s\n",
    err, strerror(err));
    return err;
    }
    }

    return 0;
    }

    static int set_realtime(pthread_t thread, int priority, int cpu)
    {
    cpu_set_t cpuset;
    struct sched_param sp;
    int err, policy;

    int min = sched_get_priority_min(SCHED_FIFO);
    int max = sched_get_priority_max(SCHED_FIFO);

    fprintf(stderr, "min %d max %d\n", min, max);

    if (priority < 0) {
    return 0;
    }

    err = pthread_getschedparam(thread, &policy, &sp);
    if (err) {
    fprintf(stderr, "pthread_getschedparam: %s\n", strerror(err));
    return -1;
    }

    sp.sched_priority = priority;

    err = pthread_setschedparam(thread, SCHED_FIFO, &sp);
    if (err) {
    fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
    return -1;
    }

    if (cpu < 0) {
    return 0;
    }
    CPU_ZERO(&cpuset);
    CPU_SET(cpu, &cpuset);
    err = pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset);
    if (err) {
    fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
    return -1;
    }

    return 0;
    }

    static void usage(char *progname)
    {
    fprintf(stderr,
    "\n"
    "usage: %s [options]\n"
    "\n"
    " -c [num] run on CPU 'num'\n"
    " -d [num] delta from wake up to txtime in nanoseconds (default %d)\n"
    " -h prints this message and exits\n"
    " -i [name] use network interface 'name'\n"
    " -p [num] run with RT priority 'num'\n"
    " -P [num] period in nanoseconds (default %d)\n"
    " -s do not use SO_TXTIME\n"
    " -t [num] set SO_PRIORITY to 'num' (default %d)\n"
    " -D set deadline mode for SO_TXTIME\n"
    " -E enable error reporting on the socket error queue for SO_TXTIME\n"
    " -b [tstamp] txtime of 1st packet as a 64bit [tstamp]. Default: now + ~2seconds\n"
    " -u [port] use udp port 'port'\n"
    "\n",
    progname, DEFAULT_DELAY, DEFAULT_PERIOD, DEFAULT_PRIORITY);
    }

    int main(int argc, char *argv[])
    {
    int c, cpu = -1, err, fd, priority = -1;
    clockid_t clkid = CLOCK_TAI;
    char *iface = NULL, *progname;

    /* Process the command line arguments. */
    progname = strrchr(argv[0], '/');
    progname = progname ? 1 + progname : argv[0];
    while (EOF != (c = getopt(argc, argv, "c:d:hi:p:P:st:DEb:u:"))) {
    switch (c) {
    case 'c':
    cpu = atoi(optarg);
    break;
    case 'd':
    waketx_delay = atoi(optarg);
    break;
    case 'h':
    usage(progname);
    return 0;
    case 'i':
    iface = optarg;
    break;
    case 'p':
    priority = atoi(optarg);
    break;
    case 'P':
    period_nsec = atoi(optarg);
    break;
    case 's':
    use_so_txtime = 0;
    break;
    case 't':
    so_priority = atoi(optarg);
    break;
    case 'D':
    use_deadline_mode = (1 << 0);
    break;
    case 'E':
    receive_errors = (1 << 1);
    break;
    case 'b':
    base_time = atoll(optarg);
    break;
    case 'u':
    udp_port = atoi(optarg);
    break;
    case '?':
    usage(progname);
    return -1;
    }
    }

    if (waketx_delay > 999999999 || waketx_delay < 0) {
    pr_err("Bad wake up to transmission delay.");
    usage(progname);
    return -1;
    }

    if (period_nsec < 1000) {
    pr_err("Bad period.");
    usage(progname);
    return -1;
    }

    if (!iface) {
    pr_err("Need a network interface.");
    usage(progname);
    return -1;
    }

    if (set_realtime(pthread_self(), priority, cpu)) {
    return -1;
    }

    fd = udp_open(iface, clkid);
    if (fd < 0) {
    return -1;
    }

    err = run_nanosleep(clkid, fd);

    close(fd);
    return err;
    }
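
The timing arithmetic of run_nanosleep() can be sketched in Python as follows. This is an illustration only: `first_wakeup` and `schedule` are hypothetical names, the clock reading is a made-up value, and no packets are sent. It shows how the wake-up time and the txtime relate: the sender wakes up `waketx_delay` nanoseconds before each txtime, and both advance by one period per packet.

```python
ONE_SEC = 1_000_000_000


def first_wakeup(now_sec, waketx_delay, base_time=0):
    """Return (wakeup_ns, txtime_ns) for the first packet.

    Without a base time, the first txtime lands on a whole-second boundary
    two seconds after the current second, as in run_nanosleep().
    """
    if base_time == 0:
        wakeup = (now_sec + 1) * ONE_SEC + (ONE_SEC - waketx_delay)
    else:
        wakeup = base_time - waketx_delay
    return wakeup, wakeup + waketx_delay


def schedule(wakeup, txtime, period, count):
    """List the (wakeup_ns, txtime_ns) pairs of the first 'count' packets."""
    return [(wakeup + i * period, txtime + i * period) for i in range(count)]


# Made-up clock reading of 100 s, with the default delay and period.
wakeup, txtime = first_wakeup(100, 500_000)
print((wakeup, txtime))  # (101999500000, 102000000000)

for pair in schedule(wakeup, txtime, 1_000_000, 3):
    print(pair)
```

When a base time is given with `-b`, that value is used directly as the txtime of the first packet and only the wake-up time is shifted earlier by the delay.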