Why your xen migration is throwing an error message: Clocksource tsc unstable (delta = 125002555 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.

After migrating a guest to a host with a similar (but not identical) CPU, the following message appears in the logs:

    Clocksource tsc unstable (delta = 125002555 ns).  Enable clocksource failover by adding clocksource_failover kernel parameter.

The kernel attempts to keep its clocks consistent across a migration: the clocksource watchdog seems to iterate through all of the watched clocksources. In case you're not used to reading kernel code, I'll attempt to explain what is happening.

The kernel implements a clocksource watchdog to ensure that the clock is sane for process accounting and that clock behavior is generally consistent.

My comments relevant to the discussion are in /* */ style delimiters, and this is a modified version of the source code, cut back for easier understanding.

/* From: kernel/time/clocksource.c */

    /* We iterate through each of the clocksources on the watchdog list */
    list_for_each_entry(cs, &watchdog_list, wd_list) {

            /* Clocksource already marked unstable we skip it. */
            if (cs->flags & CLOCK_SOURCE_UNSTABLE) {
                    if (finished_booting)
                            schedule_work(&watchdog_work);
                    continue;
            }

            /* Calculate the nanoseconds elapsed on the watchdog clock */
            wd_nsec = clocksource_cyc2ns((wdnow - cs->wd_last) & watchdog->mask,
                                         watchdog->mult, watchdog->shift);

            /* calculate the current clocksource ns */
            cs_nsec = clocksource_cyc2ns((csnow - cs->cs_last) &
                                         cs->mask, cs->mult, cs->shift);

            /* Check the deviation from the watchdog clocksource. */
            if (abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD) {
                    if (clocksource_failover)
                            clocksource_unstable(cs, cs_nsec - wd_nsec);
                    else
                            printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns).  Enable clocksource failover by adding clocksource_failover kernel parameter.\n",
                                   cs->name, cs_nsec - wd_nsec);
                    continue;
            }
    }

This is the loop that generates the message that you're seeing in the messages file.

In this source we see that it iterates through each of the clocks in the watchdog list.

If the clock's flags are marked as 'unstable', it jumps to the next iteration of the loop.

If not, it goes on to check the sanity and stability of the clocksource, then goes back to the top of the loop.

I believe that the first time through this loop, it is testing the "xen" clocksource, the second time through it is testing the "tsc" clock source.

There is a conditional that checks for a kernel parameter (clocksource_failover), which allows a clocksource to be marked unstable if its deviation from the watchdog is above the threshold for allowed error.
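The deviation check in the loop can be sketched in user space as below. This is a minimal sketch, not the kernel code: the helper names `cyc2ns` and `deviation_exceeds_threshold` are made up for illustration, and the threshold value assumes the kernel defines `WATCHDOG_THRESHOLD` as `NSEC_PER_SEC >> 4` (62,500,000 ns), which is worth confirming against the kernel/time/clocksource.c of the kernel you are running.

```c
#include <stdint.h>
#include <stdlib.h>

#define NSEC_PER_SEC 1000000000LL
/* Assumption: mirrors the kernel's definition, 1/16 of a second. */
#define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)

/* Hypothetical helper: convert a cycle delta to ns as (cycles * mult) >> shift. */
static int64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
        return (int64_t)((cycles * (uint64_t)mult) >> shift);
}

/* Hypothetical helper: 1 if the clocksource and the watchdog disagree by
 * more than the threshold over the same interval, 0 otherwise. */
static int deviation_exceeds_threshold(int64_t cs_nsec, int64_t wd_nsec)
{
        return llabs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD;
}
```

Under that assumption, the delta of 125002555 ns from the log message is roughly twice the threshold, so `deviation_exceeds_threshold(625002555LL, 500000000LL)` would return 1 and the warning branch would be taken.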

The clocksource_unstable code is as follows:

    static void clocksource_unstable(struct clocksource *cs, int64_t delta)
    {
            printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns)\n",
                   cs->name, delta);
            __clocksource_unstable(cs);
    }

Which pretty much directly calls:

    static void __clocksource_unstable(struct clocksource *cs)
    {
            cs->flags &= ~(CLOCK_SOURCE_VALID_FOR_HRES | CLOCK_SOURCE_WATCHDOG);
            cs->flags |= CLOCK_SOURCE_UNSTABLE; /* This being set makes the watchdog loop skip this clocksource */
            if (finished_booting)
                    schedule_work(&watchdog_work);
    }

This clears the clock's high-resolution and watchdog flags and marks the clocksource as unstable. Because the loop skips clocksources that are already marked unstable, the message should appear only once, assuming the kernel parameter clocksource_failover is set.

So I'd expect to see something like this in the logs when the clocksource_failover kernel parameter is set:

    kernel: Clocksource tsc unstable (delta = 10101001 ns)

The clocksource should be marked unstable (once) and the system should then continue normally; the message should not repeat many times.
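To illustrate why the message should appear only once with failover enabled, here is a small user-space simulation of the loop's control flow. It is only a sketch: `struct fake_clocksource`, `count_warnings`, and the `CS_*` flag values are stand-ins invented for this example (the real flags are defined in include/linux/clocksource.h), and the clocksource is assumed to be over the threshold on every pass.

```c
#include <stdint.h>

/* Illustrative flag bits only; not taken from the kernel headers. */
#define CS_WATCHDOG       0x10u
#define CS_VALID_FOR_HRES 0x20u
#define CS_UNSTABLE       0x40u

/* Hypothetical stand-in for struct clocksource. */
struct fake_clocksource {
        const char *name;
        unsigned int flags;
};

/*
 * Run `ticks` watchdog passes over one clocksource that is over the
 * threshold every time. Returns how many warnings would be logged.
 */
static int count_warnings(struct fake_clocksource *cs, int failover, int ticks)
{
        int warnings = 0;
        int i;

        for (i = 0; i < ticks; i++) {
                /* Already marked unstable: skipped without logging. */
                if (cs->flags & CS_UNSTABLE)
                        continue;

                /* Both branches of the failover check log one warning. */
                warnings++;

                /* Only the failover branch marks the clocksource unstable. */
                if (failover) {
                        cs->flags &= ~(CS_VALID_FOR_HRES | CS_WATCHDOG);
                        cs->flags |= CS_UNSTABLE;
                }
        }
        return warnings;
}
```

With `failover` set, ten passes produce a single warning because the unstable flag short-circuits every later pass. Without it, the flag is never set and each pass logs again, which matches the repeating message seen after the migration.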

Reproducing this issue will likely require a migration from the same source and destination hosts that the situation was previously observed with.

So in short:

Try the clocksource_failover kernel parameter on the guest.

Migrate it between the same hosts that caused the issue previously.

Some applications may attempt to read the TSC from the CPU directly (using the rdtsc assembly instruction), so it might be best to leave TSC enabled as a timesource.
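When testing, it can help to confirm which clocksource the guest is actually using before and after the migration. The sketch below reads a one-line file; `read_first_line` is a helper written for this example, and it assumes the guest kernel exposes the usual sysfs files such as /sys/devices/system/clocksource/clocksource0/current_clocksource, which may not be the case on every kernel.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of `path` into buf, with the trailing newline removed.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
        FILE *f = fopen(path, "r");

        if (f == NULL)
                return -1;
        if (fgets(buf, (int)len, f) == NULL) {
                fclose(f);
                return -1;
        }
        fclose(f);
        buf[strcspn(buf, "\n")] = '\0';
        return 0;
}
```

Calling it with the current_clocksource path should fill the buffer with a name such as "xen" or "tsc"; a return of -1 most likely means the path is not present on that guest.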

wmealing/gist:5316542