Reality and Representation (Bordwell)

When I say "blue", everyone can picture that color in their head... but do we really picture the same color for this particular wavelength of light? Maybe some people see the sky and everything called "blue" the way I see everything called "yellow", and nobody notices the internal discrepancy because we all consistently call like-colored objects by the same agreed name. Color blindness is a similar case, except that it is a disorder we can detect.

Human perception is limited by the very nature of our brains, because the brain is what we use to comprehend. But this is high-end theory. Cinema doesn't make use of such refined aspects of the perceptual process. In Plato's Cave, cinema is the 2-dimensional monochrome shadow on the wall, and real-life human perception is what happens in the cave behind the fire, invisible to the prisoner-observer. Cognitive science strives to explain the universe of daylight outside the cave, which is of metaphysical interest and does not really affect the understanding of cinema one way or another. The science of perception is infra-language: it comes before any possible communication. A fully functioning eye-ear-brain apparatus is what makes the formation, perception and comprehension of language possible. Visual language comes afterward, developed by society with whatever the human brain could handle.

So says Bordwell: "But we don’t have to worry about whether it’s true; what matters is that filmmakers invoke it and film viewers follow their lead. Storytellers are practical psychologists, preying (usually in a good sense) on our habits of mind in order to produce experiences." He says this after citing numerous such studies.

Cinema is made by flawed humans and watched by equally flawed humans. That's all film theory needs to know. If a camera could record images that we are unable to perceive or understand (e.g. infrared or high-speed flickering), filmmakers wouldn't use them, because they would have no effect on the spectator's experience.

Bordwell: "Film academics assume, along with most humanists, that once you set aside some uninteresting aspects of the human creature, usually summed up as “physiology,” culture goes all the way down. Beyond cell division and digestion, let’s say, everything is cultural, and to invoke any other explanations risks rejection."

Bordwell wrote an essay on "commonsense film theory" (there are at least a couple of academics who could learn why "common sense" is not undesirable in film criticism!).
I wonder why he uses the likeness of images rather than their editing to question spectators' perception. The audience reacts to screen images in just about the same way it reacts to reality. Only editing is really specific to film language and could possibly require a different process than perception of the real world does.
The real world we live in is full of situations in which mankind has evolved to perceive and interpret incomplete images and to infer what is missing, just as in the movies. Humans are perfectly able to understand a reflection on a mirroring surface, a shadow on a wall, a silhouette in the fog, a sound without image, a view without sound, an animal through blocking leaves, a face behind a glass window or a translucent curtain... In this regard, cinema didn't pose any major problem to our perception that we hadn't already had to solve.

Even the absence of stereoscopy (perception of depth) merely equates to looking through a keyhole in real life. We are not disconcerted by this situation. It's harder to make instant sense of perspective, but we manage. So a video image of a person is not so different from looking at someone in real life, behind a window or through a keyhole, with a peculiar texture on the glass that alters the resolution of the image (as typical video resolution does).

You are sitting on a bench in a park, in real life, and you eavesdrop on a couple next to you having an argument. You don't know how it started, what succession of events led to this conflict, or what their past, their life and their background are. But you can try to follow what is going on anyway, interpret the body language and the fragments of sentences you catch, make your own speculations and infer what it is all about, little by little. They walk into a building near enough for you to continue your observation, and you only see and hear them each time they pass an open window. This is no different from what audiences are subjected to on a cinema screen. We are well equipped to fill in most of the blanks and to get at least a basic understanding of the action, even when the distillation of bits of information and visual cues, and the timing of their obstruction, weren't designed by a narrator.

I'm not saying that cinema narration is an innate language, but as far as cognitive process is concerned, neither the imperfect nature of screen images nor the editing of the timeline should confuse spectators more than real life does. Not to mention the mental processes mankind developed and mastered for the transmission of oral traditions, such as the memorization of a continuous timeline, at the scale of a day, a lifetime or a village's history, and the manipulation of a discontinuous timeline to construct dramatised stories or lies. Cinema came after cave paintings, music, poetry, theatre and literature... and built a language largely based on previously acquired skills of perception, mental visualisation, communication and memory.

"Recognizing the contents of realistic images, I’ve suggested, depends heavily upon our everyday perceptual abilities. Similarly, filmic storytelling relies upon cognitive dispositions and habits we’ve developed in a real-world context." 
I wish the examples of "cinema narration" were a little more medium-specific than the ones he cites to question our understanding of cinema conventions: a man running away from someone (as easily identifiable in real life, with no more context, as on a screen), a red-tinted sexy scene with saxophone (a real-life cliché), Daffy Duck (similar to identifying stick figures on a wall or a sock puppet)...
In these cases, cognitive science only indicates how humans generally perceive real-world situations; it doesn't explain what is specific to film language, or what in filmic representation poses a problem to ordinary cognition...
I wish "cognitive science based film studies" would look into questions that are less about plot-point stereotypes (reading emotions, intentions, causality, continuity, spatial orientation...) and more about the subtle quintessence of film language (frame composition, harmony, sense of duration, pauses, collision of images, rhythmic editing, organic crowds, permanent landscapes...), because filmmakers need to learn to use richer cues: more elaborate, more indirect, delayed in time, secondary cues that play no part in advancing the plot, cues for transient impressions that only qualify a shot or a moment.

"Further, Narrative in the Fiction Film argued that the conventions that guide our inferential extrapolation don’t simply float free in space. There were recurring clusters of favored choices for presenting causality, time, and space. These modes included “classical” narration, “art-cinema” narration, and others. The historical layout still seems valid to me, and they seem to have proven useful to other researchers."
I'd like to come back to this nebulous label of "art cinema" later, because I don't think it represents anything identifiable by a standard form the way "classical narration", "Hollywood" (by period) or "Mainstream" could. As with "Avant Garde", the concept of "art cinema" doesn't correspond to a definite format, or to any films in particular; it only casts them outside of the well-known classicism. Mentioning "classical narration" refers to a set of codes and conventions that will match almost every film we file under this umbrella term. However, what you call "art-films" are practically all different from each other (except for certain trends and "schools"), in addition to being different from the classical format.

"Considering narrative comprehension as inferential led me to bring in the Russian Formalist distinction between fabula and syuzhet. These two terms have been used in several ways, but the most plausible way, it seemed to me then [Narrative in the Fiction Film, 1985] and seems still, is to see fabula as the chronological-causal string of events that may be presented by the syuzhet, the configuration of events in the narrative text as we have it."
Serge Daney was already talking about "énoncé" (=fabula) and "énonciation" (=syuzhet) in 1974 (Cahiers du cinéma, n°248-249-250, January-May 1974).

"But I now think that the inference-making takes place in a very narrow window of time, and it leaves few tangible traces. What is built up in our memory as we move through a film is something more approximate, more idiosyncratic, more distorted by strong moments, and more subject to error than the fabula that the analyst can draw up. [..] In assigning to the spectator the task of ongoing fabula construction, NiFF harmonized with one premise I consider central: a holistic sense of form. Even if we scan the entire narrative through a narrow slit, it’s important for the analyst and theorist to consider the overall design of the work, the more or less coherent principles that govern the unfolding tale. I’m thinking of such matters as smoothly cascading character goals, psychological motives and personality change, gradual development of knowledge, shifts in viewpoint, repeated and varied motifs, and finer-grained patterns of visual and sonic presentation."
That is an interesting aspect of human perception applied specifically to the cinema experience, and of how the limitations of this experience can be exploited and manipulated by the narrative design.
Even if we do not recall every element of a film by the end of the screening, or even during the film, our subconscious does. Often the film nonetheless needs to use a self-quoting flashback (of images already projected in the distant beginning of the film) to make a pointed reference, as if the audience had already forgotten what they saw a few minutes ago. And the emotional climate within which each scene or image is registered (in a state of fear, under a tone of suspicion, in awe, or accompanied by a music cue...) gives each remembered image a distinct accent that influences the way we store and retrieve that memory, whether in long-term memory (belonging to our personal history) or in short-term memory (for single use within the duration of the film).

