We propose a computational cognitive Extended Visual Memory (EVM) model for a Computer-Aided Vision (CAV) framework to assist humans in vision-related tasks. The CAV framework exploits wearable sensors, such as cameras and GPS, together with ambient computing facilities, to empower a user's vision and memory functions by answering four types of queries central to visual activities. EVM learning relies on both frequency-based and attention-driven mechanisms to store view-based visual fragments (VFs), which are abstracted into high-level visual schemas (VSs); both reside in visual long-term memory. During inference, visual short-term memory plays a key role in the schematic representation of, and the similarity computation between, a visual input and a VF, which is exemplified from a VS when necessary. In this paper, we describe the CAV framework and the new EVM model, followed by an implementation scenario on assisted living.