Can machine automatically generate multiple distinct and natural hand grasps, given specific contact region of an object in 3D? This motivates us to consider a novel task of Region Controllable Hand Grasp Generation (RegionGrasp), as follows: given as input a 3D object, together with its specific surface area selected as the intended contact region, to generate a diverse set of plausible hand grasps of the object, where the thumb finger tip touches the object surface on the contact region. To address this task, RegionGrasp-CVAE is proposed, which consists of two main parts. First, to enable contact region-awareness, we propose ConditionNet as the condition encoder that includes in it a transformer-backboned object encoder, O-Enc; a pretraining strategy is adopted by O-Enc, where the point patches of object surface are ran- domly masked off and subsequently restored, to further capture surface geometric information of the object. Second, to realize interaction aware- ness, HOINet is introduced to encode hand-object interaction features by entangling high-level hand features with embedded object features through geometric-aware multi-head cross attention. Empirical evalua- tions demonstrate the effectiveness of our approach qualitatively and quantitatively where it is shown to compare favorably with respect to the state of the art methods.