As an aside (let the hate come), I think the word "functional" is stupid. Obviously $\int_a^b f \; dx$ is going to depend on the $f$ we choose. Adding the word "functional" seems confusing and pointless, so don't let it confuse you. And what's the difference between a "functional" and a composition of functions? (Here come all the "fine details" to explain the difference.) And what's the difference between a "functional" and a normal, everyday function $f(x)$, where you think of each $x$ as a constant function, which would make $f$ a "functional"? But I'm interested in reading the replies to this question and learning something (and maybe you can also learn from me: I don't like pointless words or useless categories; make learning easy, and know what you're talking about).
OK, I'm beginning to see. Let's consider the notation $f(g(x))$. How should I read this? To most people this is a composition of two functions $f$ and $g$. $f$ has its domain $D_f$ and range $R_f$, and $g$ has its own domain $D_g$ and range $R_g$. (I don't like the word "codomain"; let the codomain be the range. I just want the set of inputs and the set of outputs, that is, the range.) The composition is written $f \circ g$, or, if we are labeling the domain elements of $g$ as $x \in D_g$, as $f(g(x))$. Now, just for clarity, the composition is neither $f$ nor $g$ but a function in its own right (under the hood we see that it is built from $f$ and $g$, but it is still its own function; let me explain). A function is defined by 1) its domain elements, 2) its range elements, and 3) the mapping or linkage between the two. In the composition above, the domain elements are the $x$'s of $g$, but not necessarily all of $D_g$: only those $x$ for which $g(x) \in D_f$. Call the composition $h := f \circ g$, or $h(x) := f(g(x))$, just to indicate that it's a single-variable function and that you know what you are talking about.
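To make the domain point concrete, here is a small example with functions I picked purely for illustration (they are not from the question). Take $g(x) = x - 1$ with $D_g = \mathbb{R}$ and $f(y) = \sqrt{y}$ with $D_f = [0,\infty)$. Then

$$h(x) := f(g(x)) = \sqrt{x-1}, \qquad D_h = \{\, x \in D_g : g(x) \in D_f \,\} = [1,\infty).$$

So $h$ is a third function with its own domain, here strictly smaller than $D_g$.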
Function composition is very important to understand when you learn multivariable calculus. Consider a $3$-variable function $f : (u,v,w) \mapsto f(u,v,w)$. Compose it with $a: t \mapsto a(t)$, $b: t \mapsto b(t)$, $c: t \mapsto c(t)$: the composition $f(a(t), b(t), c(t))$ is a single-variable function of $t$, while $f$ was multivariable. (And its name is not $f$: you already defined $f$ to be what it is. Many people will wrongly call it $f$ anyway. To the beginner: prove that you know the difference between $f$ and the composition. If you know what you are doing, I'll let it slide.) You can only take the derivative of this new function with respect to its single variable $t$. Thus in the chain rule you'll see a $d/dt$ on the left-hand side, but on the right you'll see a $\partial/ \partial u$, a $\partial/ \partial v$, a $\partial/\partial w$ and three $d/dt$'s. And if the inner functions were functions not of $t$ alone but of $t$ and $z$, you'd now see in the chain rule a $\partial/ \partial t$ on the left (applied not to $f$ but to the composition; remember they are different functions and hence have different names), and three $\partial/\partial t$'s on the right along with the $\partial/ \partial u$, $\partial/ \partial v$, $\partial/\partial w$ applied to $f$ (assuming you were taking the partial derivative of the composition with respect to $t$; the only other option would be $z$, as it's a $2$-variable function).
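Written out, with the compositions given their own names (the names $h$ and $k$ are mine, and I'm assuming everything is differentiable), the single-variable case $h(t) := f(a(t), b(t), c(t))$ reads

$$\frac{dh}{dt} = \frac{\partial f}{\partial u}\,\frac{da}{dt} + \frac{\partial f}{\partial v}\,\frac{db}{dt} + \frac{\partial f}{\partial w}\,\frac{dc}{dt},$$

and the two-variable case $k(t,z) := f(a(t,z), b(t,z), c(t,z))$ reads

$$\frac{\partial k}{\partial t} = \frac{\partial f}{\partial u}\,\frac{\partial a}{\partial t} + \frac{\partial f}{\partial v}\,\frac{\partial b}{\partial t} + \frac{\partial f}{\partial w}\,\frac{\partial c}{\partial t}.$$

In both formulas the partial derivatives of $f$ are evaluated at the point $(u,v,w) = (a, b, c)$, with the inner functions evaluated at $t$ (or at $(t,z)$). Note that the left-hand sides are derivatives of $h$ and $k$, never of $f$.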
When it comes to the word "functional", this involves interpreting $f(g(x))$ as something completely different. In fact, if someone were to write down $f(g(x))$ and call this a "functional", I think they would be abusing the notation, given what $f(g(x))$ actually means (see the paragraphs above). If $f$ is a "functional", then "moving around in the domain" is not as we usually think of it (and we need proper notation). It's easy to move around on a 1-dimensional number line. To represent a pair of numbers, we form a grid and move around in particular directions; hence rates of change, or derivatives, require a direction. If $f$ is a functional, then $x$ isn't that important. What matters is the shape of $g$ (not a number, but the shape). But $g$'s shape can be varied in an infinite number of ways, so wouldn't the derivative of $f$ "at the starting shape of" $g$ depend on how we vary its shape? What if we don't vary $g$ on the segment $(-\infty,1)$ but add a squaring function to it on the segment $[1,\infty)$, and then let this squaring function tend to $0$? But what if we change the squaring function to something else? Then wouldn't the "functional derivative" depend on how we choose to reshape $g$?
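One common way to make this question precise is the directional (Gâteaux) derivative. What follows is only a sketch: the square-bracket notation $f[g]$, the direction function $\eta$, the small parameter $\epsilon$, and the example functional are my own choices, not something from the question. Pick a function $\eta$ describing how $g$ is reshaped, and define

$$\delta f[g;\eta] := \left.\frac{d}{d\epsilon}\, f[g + \epsilon\,\eta]\,\right|_{\epsilon = 0}.$$

For instance, take $f[g] = \int_a^b g(x)^2 \, dx$. Then

$$f[g + \epsilon\,\eta] = \int_a^b g^2 \, dx + 2\epsilon \int_a^b g\,\eta \, dx + \epsilon^2 \int_a^b \eta^2 \, dx, \qquad\text{so}\qquad \delta f[g;\eta] = 2\int_a^b g(x)\,\eta(x)\, dx.$$

So yes, in this example the answer does depend on the reshaping $\eta$, just as a directional derivative on the grid depends on the direction. But it depends on $\eta$ linearly, and the factor $2g(x)$ multiplying $\eta(x)$ plays the role the gradient plays on the grid; that factor is what is usually meant by "the functional derivative" of this particular $f$.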