This project investigates decentralized user policies for ultra-reliable low-latency communications (URLLC). The main focus is grant-free multiple access (GFMA), in which users begin data transmission without first obtaining a grant. Because users are uncoordinated during GFMA, network performance can be severely degraded by pilot collisions and interference. Our aim is to explore machine learning algorithms, especially multi-agent reinforcement learning, to develop decentralized policies that let users make access decisions cooperatively using only local information.
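
To make the pilot-collision problem concrete, the following is a minimal sketch under simplifying assumptions that are not part of the project description: every user transmits in every slot, each user picks one of a fixed set of orthogonal pilots, and a transmission succeeds only if no other user picked the same pilot. The first function gives the collision probability for uniformly random, uncoordinated pilot selection. The second uses independent, stateless epsilon-greedy Q-learning as a simple stand-in for a decentralized policy, where each user observes only its own success or failure. All names and parameter values (`num_users`, `num_pilots`, `epsilon`, `step_size`) are hypothetical and chosen for illustration only.

```python
import random


def random_access_collision_probability(num_users: int, num_pilots: int) -> float:
    """Probability that a given user's pilot is also chosen by at least one
    other user, assuming independent, uniformly random pilot selection."""
    return 1.0 - (1.0 - 1.0 / num_pilots) ** (num_users - 1)


def train_independent_learners(num_users=4, num_pilots=4, episodes=5000,
                               epsilon=0.1, step_size=0.1, seed=0):
    """Independent stateless Q-learning (assumed toy model, not the project's method).

    Each user keeps its own value estimate per pilot and updates it from a
    purely local reward: 1 if its own transmission was collision-free, else 0.
    Returns the per-user value tables and the overall success rate.
    """
    rng = random.Random(seed)
    q = [[0.0] * num_pilots for _ in range(num_users)]
    successes = 0

    for _ in range(episodes):
        # Each user selects a pilot from its own table only (no coordination).
        choices = []
        for u in range(num_users):
            if rng.random() < epsilon:
                choices.append(rng.randrange(num_pilots))  # explore
            else:
                best = max(q[u])
                ties = [p for p, v in enumerate(q[u]) if v == best]
                choices.append(rng.choice(ties))  # exploit, random tie-break

        # Count how many users landed on each pilot in this slot.
        counts = [0] * num_pilots
        for p in choices:
            counts[p] += 1

        # Local feedback and update: a pilot used by exactly one user succeeds.
        for u, p in enumerate(choices):
            reward = 1.0 if counts[p] == 1 else 0.0
            q[u][p] += step_size * (reward - q[u][p])
            successes += int(reward)

    return q, successes / (episodes * num_users)


if __name__ == "__main__":
    baseline = 1.0 - random_access_collision_probability(4, 4)
    _, learned = train_independent_learners()
    print(f"random-selection success probability: {baseline:.3f}")
    print(f"independent-learner success rate:     {learned:.3f}")
```

Comparing the two printed numbers gives a rough sense of how much a learned, purely local policy can help relative to random selection in this toy setting. It does not model interference, channel fading, sporadic traffic, or latency and reliability targets, all of which matter for URLLC.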