Thursday, February 3, 2022

Customer focus powered by SBT and Automation

Recently we experimented with Session Based Testing (as opposed to scripted testing) and reflected on how to integrate it into our processes.

We asked questions like

  • What makes human testers unique?
  • Is there space for product validation in regression testing?
  • Can we reduce test documentation efforts?

    This was a collaborative effort. I am very grateful to everybody who gave feedback, joined testing sessions or participated in some other way to help us learn and improve our testing.

    Hopefully, others will join the conversation and maybe find something useful in our findings.

    You can download the slides with speaker notes as PDF here.

    Thursday, July 8, 2021

    Testing shared memory communications with Linux on Z

    Mainframes allow for shared memory communications between LPARs on the same box through ISM - internal shared memory. More information about the Linux device driver can be found here.

    IBM provides open-source tools, including smc_run, which converts an application's use of TCP/IP sockets to SMC (Shared Memory Communications) sockets.

    Let's suppose you want to check that two of your LPARs can communicate with each other.

    First you'd want to check whether ISM is available. As ISM devices are made available as virtual PCI devices, you can simply run lspci on each LPAR.

    # lspci
    00:00.0 Non-VGA unclassified device: IBM Internal Shared Memory (ISM) virtual PCI device
    If your admin tells you that you have been provided with the ISM device but you don't see it, you might have to power it on.
    # echo 1 > /sys/bus/pci/slots/0000032/power
    Now, smc_run will convert any TCP/IP socket usage to an SMC socket. So let's suppose you have an echo server using the AF_INET address family that you can start from the command line; you'd simply run the same command, prefixed with smc_run.
    [host1]# smc_run python3 echo_server.py --host my_host_name.example.org --port 12345
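
    The echo_server.py script itself is not shown in this post. As a rough idea only, a minimal sketch of such a server could look like the following; the function names and the use of argparse are assumptions for illustration. The point is that the code uses ordinary AF_INET stream sockets and contains nothing SMC-specific.

```python
import argparse
import socket


def make_server_socket(host: str, port: int) -> socket.socket:
    """Create a listening TCP socket (AF_INET, SOCK_STREAM)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def echo_one_connection(srv: socket.socket) -> None:
    """Accept one client and send back whatever it sends until it closes."""
    conn, _addr = srv.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def main(argv=None) -> None:
    """Hypothetical command line entry point: --host and --port as in the post."""
    parser = argparse.ArgumentParser(description="Minimal TCP echo server")
    parser.add_argument("--host", required=True)
    parser.add_argument("--port", type=int, required=True)
    args = parser.parse_args(argv)
    with make_server_socket(args.host, args.port) as srv:
        while True:
            echo_one_connection(srv)
```

    To use the sketch as a script, call main() at the end of the file.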
    You can then simply send data from a client in the same way.
    [host2]# smc_run python3 send_data.py --host my_host_name.example.org --port 12345
    some data
    Looks like it's working right? But does it really?

    If you power off one of the ISM devices, the above scenario still works. How can that be?

    Reading through the manpages, we'll find in the af_smc manpage:

    SMC socket capabilities are negotiated at connection setup. If one peer is not SMC capable, further socket processing falls back to TCP usage automatically.

    So, how can we make sure that our LPARs really communicate through the ISM?

    Luckily, the smc-tools package ships another tool, smcss. It shows details for AF_SMC socket connections. The Mode column shows how data is exchanged:

    SMCD     The SMC socket uses SMC-D for data exchange.

    SMCR     The SMC socket uses SMC-R for data exchange.

    TCP        The SMC socket uses the TCP protocol for data exchange, because an SMC connection could not be established.

    And indeed, while the connection is open, the Mode column differs depending on whether ISM is available on both LPARs.

    [host1]# smcss
    State   UID   Inode   Local Address       Peer Address        Intf Mode
    ACTIVE  00000 22045079 192.168.0.10:12223  192.168.0.12:37060  0000 SMCD
    
    vs.
    [host1]# smcss
    State   UID   Inode   Local Address       Peer Address        Intf Mode
    ACTIVE  00000 22049662 192.168.0.10:12223 192.168.0.12:37058  0000 TCP 0x05000000/0x03030000
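
    For an automated check, the Mode column can be evaluated from a script. The following is only a sketch: it assumes the column layout shown in the two outputs above (Mode is the seventh whitespace-separated field), and the names parse_smcss and uses_shared_memory are made up for this example.

```python
from typing import List, NamedTuple


class SmcConnection(NamedTuple):
    state: str
    local: str
    peer: str
    mode: str


def parse_smcss(output: str) -> List[SmcConnection]:
    """Parse smcss output, assuming the column layout shown in this post."""
    connections = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and rows without a Mode column
        if len(fields) < 7 or fields[0] == "State":
            continue
        connections.append(
            SmcConnection(fields[0], fields[3], fields[4], fields[6])
        )
    return connections


def uses_shared_memory(output: str, peer_ip: str) -> bool:
    """True only if connections to peer_ip exist and all of them use SMC-D."""
    modes = [
        c.mode
        for c in parse_smcss(output)
        if c.peer.rsplit(":", 1)[0] == peer_ip
    ]
    return bool(modes) and all(mode == "SMCD" for mode in modes)


sample = """State   UID   Inode   Local Address       Peer Address        Intf Mode
ACTIVE  00000 22045079 192.168.0.10:12223  192.168.0.12:37060  0000 SMCD
"""
print(uses_shared_memory(sample, "192.168.0.12"))  # True
```

    In a test, the output string would come from running smcss while the connection is still open, for example via subprocess.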
    

    Finally, since version 1.5 the smc-tools also offer another tool, smc_chk, that helps to check live ISM connectivity without a TCP application. You can simply run:

    [host1]# smc_chk -S
    Server started on port 37374
    [host2]# smc_chk -C 192.168.0.12 -p 37374
    Test with target IP 192.168.0.12 and port 37374
      Live test (SMC-D and SMC-R, EXPERIMENTAL)
         Success, using SMC-D

    Friday, September 4, 2020

    How to align on the right - partition alignment algorithm

     In MSDOS 6.22 there are alignment restrictions for partitions. This means that, for a partition of size capacity = |start - end| = end - start, the partition boundaries (start, end) must coincide with certain boundaries, in this case cylinder boundaries.



    The following alignment algorithm is taken from the libvirt virtualization API.

    In short, the algorithm makes sure the allocated contiguous space is aligned on the right, that is, at the end. If the available free space already starts at a boundary value, the allocation will be fully aligned [1].

    We'll have:

    1. Input: c := capacity, l := alignment interval, s := start
    2. Output: e := end
    All of these values are in ℤ (actually ℕ). For e we require c <= e - s and e + 1 = n*l (for some natural n); that is, the required capacity fits into the allocated space and the end is aligned, ending one unit before the next interval starts.

    Let r := l - (c mod l). We understand r as the extra space required to reach the interval boundary. For example, if l is 512 (think of sector size) and I need to allocate capacity 618, then 1*l won't cover c; instead I'd have to use 2*l = 1024 >= 618. But then I have 1024 - 618 = 406 = 512 - (618 mod 512) of extra space that I need to allocate although it wasn't really required.

    The algorithm handles three cases:
    1. s = m*l, for some m (the start is aligned at a boundary)
    2. s != m*l; s mod l <= r (the start offset fits into the extra space reserved for alignment)
    3. s != m*l; s mod l > r (the start offset doesn't fit) 
    For 1. the correct e is easy to find: we already know how much extra space r is needed for alignment, and we subtract 1 so that the partition ends just before the next boundary and the next partition can start exactly on it.

    (1)    e = s + c + r - 1

    This is the base for the other two cases.

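    For 2., applying (1) would surpass the boundary. Compare the aligned start of case 1. (top) with the offset start of case 2. (bottom):

    boundary=s       ...       s+c       boundary=s+c+r
    |                ...       |         |

    boundary    s       ...       s+c    boundary    s+c+r
    |           |       ...       |      |           |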

    But we know that s mod l <= r, therefore s+c+r - s mod l >= s + c proving that the alignment on the right would fit the required capacity. Thus:

    (2)    e = s + c + r - s mod l - 1

    And e is still on boundary: e = s + c + r - s mod l - 1 =  (c + r) + (s - s mod l) - 1 = n_1*l + n_2*l - 1.

    Now for 3., using (2) would give e - s = s + c + r - s mod l - s = c + r - s mod l < c, so the capacity would not fit. If we had originally defined r to be 2*l - (c mod l), then e - s > c. But we didn't do that because for 1. and 2. that would be a waste of space. Here, however, we have no other choice, so we add another l:

    (3)    e = s + c + r + l - s mod l - 1

    In the referenced algorithm, you'll see that s mod l is always subtracted. Keep in mind that s is on a boundary iff s mod l = 0, so case 1. is covered by (2) as well. We can therefore summarize all three cases as

    (4)    e = s + c + r + d - s mod l - 1, where d := 0 if s mod l <= r, else d := l.
    QED.
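
    To make the three cases tangible, here is a small sketch of the summarized formula. The function name aligned_end is made up for this example, and the sketch treats e as an inclusive end (so the allocated size is e - s + 1), which is an assumption on my side. It follows the definition of r above literally, so r = l when c is already a multiple of l.

```python
def aligned_end(start: int, capacity: int, interval: int) -> int:
    """Return the right-aligned, inclusive end e for a partition.

    start    -- s, first unit of the partition
    capacity -- c, number of units required
    interval -- l, alignment interval (e.g. cylinder size)
    """
    r = interval - (capacity % interval)       # extra space up to the boundary
    offset = start % interval                  # 0 iff start is on a boundary
    d = 0 if offset <= r else interval         # case 3 needs one more interval
    return start + capacity + r + d - offset - 1


# Case 1: start on a boundary
print(aligned_end(0, 618, 512))    # 1023
# Case 2: offset 100 fits into r = 406
print(aligned_end(100, 618, 512))  # 1023
# Case 3: offset 500 does not fit into r = 406
print(aligned_end(500, 618, 512))  # 1535
```

    In all three examples e + 1 is a multiple of 512 and the range from s to e holds at least 618 units.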

    [1]  I wonder if the need for a first partition not starting exactly at the second cylinder to save space or some other MSDOS restrictions are the reason for not aligning the partition start, too.

    Monday, July 27, 2020

    Set up Crypto Card passthrough with KVM on IBM Z (vfio-ap)

    Crypto Cards on IBM Z systems provide secure key encryption.

    This security feature can be passed through to KVM guests.

    The passthrough is available through the vfio_ap kernel module (paired with the homonymous driver). It uses another passthrough interface, namely, the VFIO mediated device framework (represented by kernel module vfio_mdev).

    More details can be found in the kernel documentation. Here, the focus is on setting up a single passthrough using libvirt.

    What we need
    1. A System Z host with a crypto card, KVM guest
    2. lszcrypt command (from s390-tools; it often comes preinstalled with the distro, and the package name can also be s390utils)
    What we do
    1. Identify the device
    2. Mark device queues as not usable by host
    3. Create mediated device
    4. Assign crypto device to mediated device
    5. Attach mediated device to guest
    6. Verify setup
    Identify the device

    # lszcrypt -V

    CARD.DOMAIN TYPE  MODE        STATUS  REQUESTS  PENDING HWTYPE QDEPTH FUNCTIONS  DRIVER     
    --------------------------------------------------------------------------------------------
    01          CEX5C CCA-Coproc  online         1        0     11     08 S--D--N--  cex4card   
    01.0011     CEX5C CCA-Coproc  online         1        0     11     08 S--D--N--  cex4queue  

    We need two pieces of information:
    • HWTYPE: passthrough is only supported if this number is >= 10
    • CARD.DOMAIN: 0x01 (adapter id), 0x0011 (domain id)
    Mark device queues as not usable by host
    • Mark adapter not usable by host:
      • echo -0x01 > /sys/bus/ap/apmask
    • Mark device queues not usable by host:
      • echo -0x0011 > /sys/bus/ap/aqmask
    lszcrypt now should only list the card, not the queue.

    # lszcrypt
    CARD.DOMAIN TYPE  MODE        STATUS  REQUESTS
    ----------------------------------------------
    01          CEX5C CCA-Coproc  online         4
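
    The adapter and domain values written to apmask and aqmask come straight from the CARD.DOMAIN column. As a small illustration, the following sketch derives them from an entry such as 01.0011; the helper names are hypothetical, and it assumes both parts are hexadecimal as in the lszcrypt output above.

```python
from typing import Tuple


def split_card_domain(card_domain: str) -> Tuple[int, int]:
    """Split a CARD.DOMAIN entry such as '01.0011' into (adapter, domain)."""
    card, domain = card_domain.split(".")
    return int(card, 16), int(domain, 16)


def mask_removals(card_domain: str) -> Tuple[str, str]:
    """Return the strings to write to apmask and aqmask, in that order."""
    adapter, domain = split_card_domain(card_domain)
    return f"-0x{adapter:02x}", f"-0x{domain:04x}"


print(split_card_domain("01.0011"))  # (1, 17)
print(mask_removals("01.0011"))      # ('-0x01', '-0x0011')
```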



    Create mediated device

    cd /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough
    uuid=$(uuidgen)
    echo $uuid > create

    Assign crypto device to mediated device

    Change into the device directory $uuid that was just created below vfio_ap-passthrough/devices:

    cd devices/$uuid
    echo 0x01 > assign_adapter
    echo 0x0011 > assign_domain

    We can confirm assignment:

    # lszcrypt -V
    CARD.DOMAIN TYPE  MODE        STATUS  REQUESTS  PENDING HWTYPE QDEPTH FUNCTIONS  DRIVER     
    --------------------------------------------------------------------------------------------
    01          CEX5C CCA-Coproc  online         4        0     11     08 S--D--N--  cex4card   
    01.0011     CEX5C CCA-Coproc  -              4        0     11     08 S--D--N--  vfio_ap    

    Attach the mediated device to guest

    Use $uuid for the mediated device and modify the guest domain XML (e.g. with virsh edit):
    <hostdev mode='subsystem' type='mdev' model='vfio-ap'>
        <source>
          <address uuid='$uuid'/>
        </source>
    </hostdev>
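
    If the snippet is generated by a script rather than typed into virsh edit, it can be built with the standard library. This is only a sketch; the function name vfio_ap_hostdev_xml is an assumption, and the element and attribute names are the ones from the snippet above.

```python
import uuid
import xml.etree.ElementTree as ET


def vfio_ap_hostdev_xml(mdev_uuid: str) -> str:
    """Build the <hostdev> element for a vfio-ap mediated device."""
    uuid.UUID(mdev_uuid)  # raises ValueError if the UUID is malformed
    hostdev = ET.Element(
        "hostdev", {"mode": "subsystem", "type": "mdev", "model": "vfio-ap"}
    )
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(source, "address", {"uuid": mdev_uuid})
    return ET.tostring(hostdev, encoding="unicode")


# Example with a freshly generated UUID standing in for $uuid
print(vfio_ap_hostdev_xml(str(uuid.uuid4())))
```

    The resulting string could then be saved to a file and attached with virsh attach-device instead of editing the domain XML by hand.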


    Verify setup

    After starting the guest:

    root@guest # lszcrypt -V

    CARD.DOMAIN TYPE  MODE        STATUS  REQUESTS  PENDING HWTYPE QDEPTH FUNCTIONS  DRIVER     
    --------------------------------------------------------------------------------------------
    01          CEX5C CCA-Coproc  online         1        0     11     08 S--D--N--  cex4card   
    01.0011     CEX5C CCA-Coproc  online         1        0     11     08 S--D--N--  cex4queue  

    Tuesday, April 28, 2020

    The testing effort growth

    Just wondering what a proof for "Number of test cases grows exponentially" could look like...

    Define a program to be a function from a set of input variables to a set of output variables,

    P = Input x P|Output = I x O 
      = { (j_1,...,j_n) x (o_1,...,o_m) }
      = { (j_1,...,j_n,o_1,...,o_m) }.

    Each component of Input is supposed to have at least 2 elements. (If not, the input variable will never change the image value, in other words the program's behavior, and can therefore be eliminated. This covers the case where an input value is only determined to be defined or not, Input ≃ { 0, {0} }.)

    Also, I need to restrict to the case where dim(I) > 1, because otherwise each extra value to test adds exactly one test case (linear growth).

    Define a feature to be a selection of a subset of input variables combined with their image, that is, it's a restriction of the program to a subset of the domain, F = P|J, J < I.

    We take a test case to be a random variable

    T : Input x Ω -> Input x Output
    T(i, ω) = (i, T_1(i, ω),..., T_m(i, ω)).

    Each T_j is an expected output or simply expectation or post-condition.

    A test passes if T_j(i, ω) = P(i)_j for all j, and fails otherwise, for a test execution ω.

    A test case of a feature is then simply T|F = T|J x Ω where F = P|J.

    For a given feature we can add new behavior minimally by
    1. extending the domain of an input variable J_i by an additional value j' that defines a new behavior of the program, that is for some i, J_i' = J_i + { j' } and Input = J_1 x ... x J_i' x ... x J_n
    2. extending the set of input variables by an additional dimension J_n+1
    Both will result in an extended feature F'.

    For 2. it is easy to see that

    #T|F' = #{ (J_1,...,J_n) x J_n+1 x O } 
          = #J * #J_n+1 * # O 
          = #T|F * #J_n+1
          ≥ #T|F * 2.

    As for 1., given that dim(J) > 1 each new value j' creates another full set of combinations of input variables (j_1,...,j',...,j_n), that is, the number of added test cases is

    #T|F' - #T|F = #(J_1 x ... x ^J_i x ... x J_n)
                 ≥ 2^(n-1)

    where ^J_i denotes not selecting this component.

    Therefore,

    #T|F' ≥ #T|F + 2^(n-1).

    So, in general, when adding a new value to test, the lower bound for growth is 2^(n-1).  □
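
    A quick way to convince yourself of both bounds is to count combinations for a tiny example. The sketch below counts input combinations only (it leaves out the output factor #O), and the concrete domains are made up for illustration.

```python
from itertools import product
from typing import List, Sequence


def input_combinations(domains: Sequence[Sequence[object]]) -> List[tuple]:
    """All combinations (j_1, ..., j_n) of one value per input variable."""
    return list(product(*domains))


# Three input variables with two values each (n = 3)
base = [[0, 1], [0, 1], [0, 1]]

# Case 1: one new value j' for the first variable
extra_value = [[0, 1, 2], [0, 1], [0, 1]]

# Case 2: one new input variable with two values
extra_variable = base + [[0, 1]]

n = len(base)
added = len(input_combinations(extra_value)) - len(input_combinations(base))
print(len(input_combinations(base)))            # 8
print(added, 2 ** (n - 1))                      # 4 4
print(len(input_combinations(extra_variable)))  # 16
```

    With only two values per variable the bounds are met exactly; larger domains make the growth correspondingly steeper.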

    Considering that the effort E|F of testing a feature depends on the selected test suite S|F < T|F, we might dare say that selecting S|F and the methods to evaluate T(i, ω) is a very important activity in testing.

    I hope this makes sense...

    Thursday, March 26, 2020

    ssh into libvirt guest

    By default, libvirt sets up NAT mode, allowing guests to communicate, for example, with the internet, with each other and with the host. A host interface vnetX will be created, and normally an IP will be assigned to the guest automatically.


    # virsh domifaddr vm
     Name       MAC address          Protocol     Address
    -------------------------------------------------------------------------------
     vnet0      52:54:00:6f:dc:90    ipv4         192.168.122.84/24
    
    # virsh dumpxml --inactive vm
    ...
        <interface type="network">
          <mac address="52:54:00:6f:dc:90"/>
          <source network="default"/>
          <model type="virtio"/>
          <address type="pci" domain="0x0000" bus="0x00" slot="0x03" function="0x0"/>
        </interface>
    ...
    

    However, vnetX itself won't really get that IP assigned; that's not how it's intended to work. So, by default, neither the IP nor any name is readily available to ssh.

    What you can do instead is use the NSS module libvirt-nss.

    If you want to easily ssh into the guest using its libvirt name, set
    # /etc/nsswitch.conf:
    hosts:       files libvirt_guest dns
    
    and make sure to have the correct sshd configuration in your guest.
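
    When this is part of an automated setup, it can be handy to verify the hosts line before relying on name resolution. A minimal sketch, with a made-up helper name and taking the file content as a string:

```python
from typing import List


def hosts_sources(nsswitch_text: str) -> List[str]:
    """Return the sources listed on the 'hosts:' line of nsswitch.conf."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            return line[len("hosts:"):].split()
    return []


config = """# /etc/nsswitch.conf:
hosts:       files libvirt_guest dns
"""
print("libvirt_guest" in hosts_sources(config))  # True
```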

    Tuesday, February 21, 2017

    Sunk Cost Fallacy - Let's talk about bias 3

    We continue our review of the biases we all have intrinsically implemented in our own brains and that lead us to make certain decisions, as Rolf Dobelli describes in his bestseller The Art of Thinking Clearly.

    Sunk Cost Fallacy - Are you Lean enough?

    This really is a very useful bias. I believe if we're honest enough with ourselves it helps us to really do a better job saving unnecessary costs for our company.

    Without much introduction, two cases that immediately come to my mind:

    1. You see this huge beautiful implementation arriving in testing. Verification passes without any flaw. You then notice: validation fails! This is not really what we wanted. Do you dare to create a bug and vote for throwing away / reworking all that hard and good work? Or do you give in and try to persuade any opposing mind that this is what they wanted? Sometimes this is not the worst choice - sometimes clients just don't know what they want XD
    2. You've convinced everybody you need those automated tests / manual test scripts / test management system - whatever - to improve your testing process and the overall quality of your product(s). It becomes a nuisance over time; things change, tempora mutantur. Are you willing to let go, to not hold on to it anymore? Throw away what lacks purpose and be really lean.