Cognitive radios are the new paradigm for improving current spectrum utilization. In this work, we discuss ``Duplex-observe-learn-and act'' (DOLA), a primary channel access control algorithm that constitutes the first alternative to ``Listen-Before-Talk'' (LBT), the traditional opportunistic spectrum access scheme for cognitive radios. DOLA seeks to exploit the bi-directional nature of many primary communication systems to enable opportunistic secondary access while protecting the primary receiver. Here, we focus on overhearing the ACK/NAK feedback notifications sent by the primary receiver (PU-Rx), and we show that primary protection can be achieved in terms of the probability of collision, which can be kept below $6\%$ by fine-tuning the secondary users' reward function.