performance tweaks for receive: added configurable threshold for issuing credit; don't disable byte credit more than necessary; avoided n-squared loop for generating acks