I briefly skimmed the paper; it looks like they're using PWM, but not to its full potential. I would also use it as a means of synchronization: the attacker points its LED/laser and receiver at the target LED and sends, say, a 10% duty-cycle PWM signal, except for a 50%-wide start bit that marks the start of the word being transmitted. The bits are then encoded as pulse widths, e.g. 10% for 1 and 20% for 0, or the other way around. Basically, the attacker talks for at most 20% of each cycle (the start bit aside) and listens for the remaining 80%.
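To make the encoding concrete, here is a rough Python sketch of the pulse-width scheme described above. It is not from the paper: the function names, the list-of-duty-cycles representation and the decoding tolerance are all my own assumptions, and it ignores the optics entirely.

```python
# Duty-cycle values taken from the comment above; everything else is hypothetical.
START_DUTY = 0.50  # wide pulse marking the start of a word
ONE_DUTY = 0.10    # pulse width encoding a 1
ZERO_DUTY = 0.20   # pulse width encoding a 0


def encode_word(bits):
    """Return one duty cycle per PWM period: the start pulse, then one pulse per bit."""
    return [START_DUTY] + [ONE_DUTY if b else ZERO_DUTY for b in bits]


def decode_word(duties, tolerance=0.04):
    """Recover bits from measured duty cycles, skipping periods until the start pulse.

    `tolerance` is an assumed measurement margin, not a value from the source.
    """
    def near(value, target):
        return abs(value - target) <= tolerance

    bits = []
    started = False
    for duty in duties:
        if not started:
            # Idle periods before the start pulse carry no data.
            started = near(duty, START_DUTY)
            continue
        if near(duty, ONE_DUTY):
            bits.append(1)
        elif near(duty, ZERO_DUTY):
            bits.append(0)
        else:
            raise ValueError(f"unrecognised duty cycle: {duty}")
    return bits
```

For example, `encode_word([1, 0, 1, 1])` yields the start pulse followed by four data pulses, and `decode_word` turns slightly noisy measurements of those pulses back into the same bits.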
The target LED can then be read to detect those pulses, and the target can sync itself to the received signal so that, when replying, it only modulates its LED during the remaining part of each cycle. Just by maintaining the link, the attacker will receive both the echo of its own transmission and the target's reply.
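As a toy illustration of that time-sharing, the sketch below simulates a single PWM cycle at sample level. The assumptions are mine: 100 samples per cycle, the target reusing the same 10%/20% pulse widths for its reply, and the reply starting exactly at the 20% mark. A real link would probably need a guard gap between the two slots, and the start-bit cycle is left out.

```python
SAMPLES = 100           # samples per PWM cycle (assumed resolution)
ATTACKER_SLOT = 20      # attacker only lights the first 20% of each cycle
ONE_W, ZERO_W = 10, 20  # pulse widths in samples for 1 and 0


def pulse(offset, width):
    """One cycle of samples with the light on for `width` samples from `offset`."""
    return [1 if offset <= i < offset + width else 0 for i in range(SAMPLES)]


def shared_cycle(attacker_bit, target_bit):
    """Light level seen on the link when both sides send in the same cycle."""
    attacker = pulse(0, ONE_W if attacker_bit else ZERO_W)
    # Hypothetical choice: the target starts its reply right after the attacker's slot.
    target = pulse(ATTACKER_SLOT, ONE_W if target_bit else ZERO_W)
    return [a | t for a, t in zip(attacker, target)]


def split_cycle(samples):
    """Separate the attacker's echo from the target's reply by time window."""
    def to_bit(width):
        # Pick whichever nominal width is closer to the measured one.
        return 1 if abs(width - ONE_W) < abs(width - ZERO_W) else 0

    echo_width = sum(samples[:ATTACKER_SLOT])
    reply_width = sum(samples[ATTACKER_SLOT:])
    return to_bit(echo_width), to_bit(reply_width)
```

Calling `split_cycle(shared_cycle(1, 0))` returns the attacker's own bit and the target's bit as a pair, which is the "echo plus reply" the attacker would see on the link.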
That's just an idea, though; I'm not claiming I'd be able to implement it effectively :).