# Advertising, mathematics of

Advertising is one of the most important promotional tools of marketing. Its main purpose is to enhance buyers' responses to the organization and its offerings by providing information and by supplying reasons for preferring a particular organization's offer.

Advertising decisions address an interrelated set of issues, including:

a) how much to spend in total and how to allocate that budget;

b) how to determine the best messages to deliver with advertising; and

c) what media schedule delivers these messages to the target audience best.

Although these decisions clearly interact (for example, a more effective advertising message may permit a lower total advertising budget), researchers have traditionally modeled these phenomena separately.

## Advertising budget setting and allocation.

Models of the size and allocation of the advertising budget vary widely, but most are closely related to the following general form:

Find $u _ { i } ( t )$, $B$, to

Maximize

$$\tag{a1} Z = \sum _ { i } \sum _ { j } \sum _ { t } S _ { i } ( t | \{ u _ { i } ( t ) \} , \{ C _ { i j } ( t ) \} ) m _ { i } - \sum _ { i } \sum _ { t } u _ { i } ( t )$$

Subject to

\begin{equation*} \sum _ { i } \sum _ { t } u _ { i } ( t ) \leq B (\text{budget constraint}), \end{equation*}

\begin{equation*} L _ { i } \leq \sum u _ { i } ( t ) \leq U _ { i } \text{(regional constraint)}, \end{equation*}

where

$S _ { i } ( t | \{ u _ { i } ( t ) \} , \{ C _ { i j } ( t ) \} )$ is the sales in area $i$ at time $t$ as function of current and historical brand and competitive advertising;

$C _ { i j } ( t )$ is the competitive advertising for competitor $j$ in area $i$;

$u _ { i } ( t )$ is the advertising level in area $i$ at time $t$;

$m_{i}$ is the margin per unit sales in area $i$;

$\{ u_i ( t ) \}$ is the entire advertising program;

$U_i$, $L_i$ are the upper and lower regional constraints;

$B$ is the budget.

Some researchers have developed a priori models [a15] designed to postulate a general structure. Examples of this approach are the models of H.L. Vidale and H.B. Wolfe [a24], M. Nerlove and K.J. Arrow [a19], J.D.C. Little [a13], [a14], H. Simon [a22], A.K. Basu and R. Batra [a4], F.S. Zufryden [a25], and V. Mahajan and E. Muller [a17].

An alternative econometric approach starts with a specific data base, usually a time series of sales and advertising. These models include those by F. Bass [a2], Bass and D.G. Clarke [a3], D.B. Montgomery and A.J. Silk [a18], J.J. Lambin [a12], A.G. Rao and P.B. Miller [a20], and J.O. Eastlack and A.G. Rao [a7]. [a8] provides a review of the econometric issues in advertising/sales response modeling.
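As a rough illustration of the general form (a1), the following Python sketch allocates a budget across regions for a single period. It is not taken from any of the cited models: the concave response curve $S_i(u) = s_i (1 - e^{-u})$, the fixed step size, the greedy rule, and all numbers are assumptions introduced here, and competitive advertising and carry-over effects are ignored.

```python
import math

def allocate(budget, step, margins, scales, lower, upper):
    """Greedy single-period allocation in increments of `step`.

    Assumed (hypothetical) response: S_i(u) = scales[i] * (1 - exp(-u)).
    Each step goes to the region with the largest marginal profit
    m_i * dS_i - step, subject to the budget and regional upper bounds.
    Assumes sum(lower) <= budget.
    """
    n = len(margins)
    u = list(lower)              # start at the regional lower bounds L_i
    spent = sum(u)

    def gain(i):
        s0 = scales[i] * (1.0 - math.exp(-u[i]))
        s1 = scales[i] * (1.0 - math.exp(-(u[i] + step)))
        return margins[i] * (s1 - s0) - step

    while spent + step <= budget + 1e-12:
        # regions that can still absorb one more step (u_i <= U_i)
        candidates = [i for i in range(n) if u[i] + step <= upper[i] + 1e-12]
        if not candidates:
            break
        best = max(candidates, key=gain)
        if gain(best) <= 0:      # further spending would reduce Z
            break
        u[best] += step
        spent += step
    return u

# Hypothetical two-region example
plan = allocate(budget=3.0, step=0.5, margins=[1.0, 1.0],
                scales=[10.0, 2.0], lower=[0.0, 0.0], upper=[2.0, 2.0])
print(plan)
```

Because the assumed response is concave, each additional step in a region is worth less than the previous one, which is what makes a step-by-step marginal rule reasonable here; with an $s$-shaped response this rule could not be relied on.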

Mahajan and Muller [a17] develop a model of the first type, where their focus is on the optimal shape of $u ( t )$. Specifically, they look at whether advertising programs should be steady or turned on and off (pulsed). They use the following notation:

$a = B / ( \overline { u } T )$ is the proportion of time (out of the total time $T$) that the firm advertises at level $\overline { u }$;

$k$ is the number of times the firm switches from advertising at $\overline { u }$ to zero;

$B$ is the advertising budget.

For an even policy, the level of spending is $a \overline{u}$; for the pulsing policies, either $\overline { u }$ or $0$ is spent. In general, in a $k$-pulsing policy

$$\tag{a2} u = \left\{ \begin{array} { c c } { \overline { u } } & { \text { for } \frac { i T } { k } \leq t < ( i + a ) \frac { T } { k }; } \\ { } & { 0 \leq i \leq k - 1, } \\ { 0 } & { \text { for } ( i + a ) \frac { T } { k } \leq t \leq ( i + 1 ) \frac { T } { k }, } \\ { } & { \text { and for } \ t = T ; 0 \leq i \leq k - 1. } \end{array} \right.$$

To link advertising pulsing to awareness they use the following functional form:

$$\tag{a3} \frac { d A } { d t } = f ( u ) ( 1 - A ) - b A,$$

where

$A$ is the fraction of market aware of the product at any point in time;

$b$ is the decay or forgetting parameter.

Here, $f ( u ) ( 1 - A )$ is the "learning effect" and $b A$ is the "forgetting effect".

The authors show that using any pulsing policy, awareness is

 (a4)

where

$$\tag{a5} x = f ( \overline { u } ).$$

With this model, the authors show that if $f ( u )$ is $s$-shaped, then it generally pays to pulse and that more frequent pulsing is better. If $f ( u )$ is concave, however, an even advertising policy is best.
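The comparison between an even policy and a $k$-pulsing policy can be explored numerically. The sketch below integrates (a3) exactly over each constant-advertising phase (for constant $x = f(u)$, the solution of (a3) is $A(t) = A^* + (A_0 - A^*) e^{-(x+b)t}$ with $A^* = x/(x+b)$) and reports time-averaged awareness. The particular $s$-shaped function $f(u) = u^2/(1+u^2)$, the parameter values, and the use of average awareness as the criterion are assumptions made here for illustration; they are not taken from [a17].

```python
import math

def f(u):
    """Hypothetical s-shaped response function (an assumption)."""
    return u ** 2 / (1.0 + u ** 2)

def advance(A0, x, b, dt):
    """Exact solution of dA/dt = x(1 - A) - bA over dt, for constant x, b > 0.

    Returns (awareness at the end, integral of awareness over the interval).
    """
    r = x + b
    A_star = x / r
    decay = math.exp(-r * dt)
    A1 = A_star + (A0 - A_star) * decay
    integral = A_star * dt + (A0 - A_star) * (1.0 - decay) / r
    return A1, integral

def mean_awareness(k, a, u_bar, b, T, A0=0.0):
    """Time-averaged awareness; k = 0 means the even policy at level a*u_bar."""
    if k == 0:
        _, total = advance(A0, f(a * u_bar), b, T)
        return total / T
    A, total, cycle = A0, 0.0, T / k
    for _ in range(k):
        A, part = advance(A, f(u_bar), b, a * cycle)        # advertising on
        total += part
        A, part = advance(A, f(0.0), b, (1 - a) * cycle)    # advertising off
        total += part
    return total / T

even = mean_awareness(0, a=0.5, u_bar=1.0, b=0.3, T=10.0)
pulsed = mean_awareness(50, a=0.5, u_bar=1.0, b=0.3, T=10.0)
print(even, pulsed)
```

With these assumed values the even level $a \overline{u} = 0.5$ lies on the convex part of $f$, so $a f(\overline{u}) > f(a \overline{u})$ and the pulsed schedule yields higher average awareness. How different values of $k$ rank against each other depends on the horizon, the initial awareness, and the criterion used, so this sketch should not be read as a reproduction of the authors' result.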

A different approach to linking advertising to awareness has been proposed by K. Jedidi, J. Eliashberg and W.S. DeSarbo [a10]. They model the system dynamics as:

$$\tag{a6} \dot { A } ( t ) = [ f ( u ( t ) ) + \beta ( X ( t ) - X ( t - \tau ) ) ] [ N _ { 0 } - A ( t ) ] , \quad A ( t _ { 0 } ) = A _ { 0 },$$

\begin{equation*} \dot { X } ( t ) = [ N ( X ( t ) , A ( t ) , t ) - X ( t ) ] \operatorname { exp } ( - k P ( t ) ) , \quad X ( t _ { 0 } ) = X _ { 0 }. \end{equation*}

For a general income density function $g ( W )$,

$$\tag{a7} N ( X ( t ) , A ( t ) , t ) = A ( t ) \int _ { a ( X ( t ) ) F + b } ^ { \infty } g ( W ) d W.$$

Here, $A ( t )$ denotes the cumulative number of aware consumers, $X ( t )$ the cumulative number of adopters, $N_ 0$ the size of the population of interest, $N ( \cdot )$ the market potential, $u ( t )$ the advertising, and $P ( t )$ the price. These equations are then incorporated into a dynamic optimization problem, and several propositions characterizing the optimal advertising (e.g., monotonically decreasing over time) and pricing (monotonic versus non-monotonic) are derived.

A.G. Rao and Miller [a20] provide an example of the econometric approach, developing a model which combines data from multiple markets over time. Their individual market model is

$$\tag{a8} S _ { t } = c _ { 0 } + c _ { 1 } u _ { t } + c _ { 1 } \lambda u _ { t - 1 } + c _ { 1 } \lambda ^ { 2 } u _ { t - 2 } + \ldots + \mu _ { t },$$

where

$S _ { t }$ is the market share at $t$;

$u _ { t }$ is the advertising spending at $t$;

$c_0$, $c_1$, $\lambda$ are constants ($\lambda < 1$);

$\mu _ { t }$ is the random disturbance.

This equation means that an incremental expenditure of one unit of advertising in a given period will yield $c_1$ share points that period, $c _ { 1 } \lambda$ in the following period, $c _ { 1 } \lambda ^ { 2 }$ the period after that, etc.

By multiplying (a8) by $\lambda$, lagging it one period, and then subtracting that equation from the original equation (a8), one obtains:

$$\tag{a9} S _ { t } = c _ { 0 } ( 1 - \lambda ) + \lambda S _ { t - 1 } + c _ { 1 } u _ { t } + \mu _ { t } - \lambda \mu _ { t - 1 } .$$

Note that the short-run effect of advertising here is $d S _ { t } / d u _ { t } = c _ { 1 }$, while the long-run effect is $c_1$ in the first period, plus $\lambda c _ { 1 } + \lambda ^ { 2 } c _ { 1 } + \ldots$ in subsequent periods, or

$$\tag{a10} \frac { c _ { 1 } } { 1 - \lambda }.$$
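The equivalence of (a8) and (a9), and the long-run effect (a10), can be checked numerically. The following sketch does so for the noise-free case ($\mu_t = 0$) with no advertising before $t = 0$; the function names and parameter values are hypothetical and chosen only for illustration.

```python
def share_distributed_lag(u, c0, c1, lam):
    """Market share from (a8) with mu_t = 0 and u_t = 0 for t < 0."""
    return [c0 + sum(c1 * lam ** s * u[t - s] for s in range(t + 1))
            for t in range(len(u))]

def share_recursive(u, c0, c1, lam):
    """Market share from (a9) with mu_t = 0, started from (a8) at t = 0."""
    S = [c0 + c1 * u[0]]
    for t in range(1, len(u)):
        S.append(c0 * (1 - lam) + lam * S[-1] + c1 * u[t])
    return S

# Hypothetical parameters: one unit of advertising at t = 0, none afterwards
c0, c1, lam = 20.0, 0.5, 0.6
u = [1.0] + [0.0] * 199

lagged = share_distributed_lag(u, c0, c1, lam)
recursive = share_recursive(u, c0, c1, lam)

# Cumulative share gain over the baseline c0, compared with c1 / (1 - lam)
cumulative_gain = sum(s - c0 for s in lagged)
print(cumulative_gain, c1 / (1 - lam))
```

The single pulse of advertising produces share gains of $c_1, c_1 \lambda, c_1 \lambda^2, \ldots$, whose sum approaches $c_1 / (1 - \lambda)$ as the horizon grows.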

Now, let

$I$ be the industry sales per year in the district;

$P$ be the district population;

$A V$ be the average rate of advertising during the period.

Then with $k$ periods per year, a unit increase in advertising produces a share increase of $c_1 / ( 1 - \lambda )$. Thus, the sales increase of an additional unit in advertising is

$$\tag{a11} y _ { i } = \Delta \text { sales } = \left( \frac { c _ { 1 } } { 1 - \lambda } \right) \frac { I } { k } ( \text { in market } i )$$

at a per capita advertising rate of $A V / P = x_i$. In other words, (a11) can be interpreted as the derivative of a general response curve at the per capita spending rate $A V / P$. These results can then be used across markets to specify the slope of a more general response curve, enabling an optimal allocation of advertising spending.

## Message and copy decisions.

Much of the effect of an advertising exposure depends on the creative quality of the advertising itself. But rating the quality of the advertising is difficult: an advertisement may have good aesthetic properties and win awards, and yet it may not do much for sales. Another advertisement may seem crude and offensive, and yet it may be a major force behind sales.

### Copy testing and measures of copy effectiveness.

See [a11] for a report on the development and testing of a new advertising copy program for AT&T Long Lines, the "cost of visit" campaign. The "cost of visit" campaign was tested against AT&T's very successful "reach out" campaign using a panel of $16,000$ households. Because there is no (necessary) delay between the time an advertisement is shown and when someone can make a call, and because AT&T automatically records the transaction, response to advertising in this setting can be read much more clearly than in other field environments.

The experiment lasted for over two years and had three phases:

1) pre-assessment ($5$ months);

2) treatment period ($15$ months); and

3) post-assessment ($6$ months).

During the pre-assessment phase, records of all households were tracked to establish a norm for their calling behaviour. In addition, all respondents received a questionnaire to determine whether the test and control groups were demographically balanced (they were).

During the treatment period the two advertising campaigns were aired at a rate that gave each household about three exposures per week. The objective of the "cost of visit" campaign was to encourage all user groups, but particularly the light user group, to call during the 60%-off deep discount period (nights and weekends). In the experiment, revenue increased by about 1% overall, with the targeted light user group yielding a 15% increase in revenue.

In order to make these assessments and to project them to the national level, the authors of [a11] used the following definitions:

USDF is the usage difference between test group (cost of visit) and control group (reach-out);

UNOFF is $0$ for pre-test weeks and $1$ for test weeks;

$\epsilon$ is the disturbance.

The equation

$$\tag{a12} \operatorname{USDF} = \alpha + \beta \operatorname{UNOFF} + \epsilon$$

models the difference in usage/household/week as a pre-period constant ($\alpha$) and a treatment constant ($\alpha + \beta$). So the statistical significance of $\beta$ for any segment (light users in a deep discount period, for example), can be read from standard confidence limits resulting from linear regression analysis.
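Because UNOFF is a $0$/$1$ indicator, the least-squares estimate of $\alpha$ in (a12) is the mean of USDF over the pre-test weeks and the estimate of $\beta$ is the difference between the test-week and pre-test-week means. The sketch below fits (a12) by ordinary least squares and computes the standard error of $\beta$ from which confidence limits would be formed. The weekly values are invented for illustration and are not data from [a11].

```python
import math

def fit_usage_difference(usdf, unoff):
    """Ordinary least squares for USDF = alpha + beta * UNOFF + eps.

    Returns (alpha, beta, standard error of beta).
    """
    n = len(usdf)
    mean_x = sum(unoff) / n
    mean_y = sum(usdf) / n
    sxx = sum((x - mean_x) ** 2 for x in unoff)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(unoff, usdf))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # residual variance with n - 2 degrees of freedom
    sse = sum((y - alpha - beta * x) ** 2 for x, y in zip(unoff, usdf))
    se_beta = math.sqrt(sse / (n - 2) / sxx)
    return alpha, beta, se_beta

# Hypothetical weekly usage differences: three pre-test weeks, three test weeks
usdf = [0.1, -0.1, 0.0, 0.5, 0.7, 0.6]
unoff = [0, 0, 0, 1, 1, 1]

alpha, beta, se_beta = fit_usage_difference(usdf, unoff)
print(alpha, beta, se_beta)
```

In this made-up example the pre-test mean is $0$ and the test-week mean is $0.6$, so the fit returns $\alpha \approx 0$ and $\beta \approx 0.6$.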

In order to project the results to the national level, they used the following model:

$$\tag{a13} y = \sum _ { i = 1 } ^ { I } \left( n _ { i } \sum _ { j = 1 } ^ { J } z _ { i j } p _ { i j } \right),$$

where

$y$ is the projected usage in a given area, assuming a given level of advertising exposure;

$i$ is the index of usage segment (light, regular, etc.), $i = 1 , \ldots , I$;

$j$ is the index of calling category (rate period), $j = 1 , \ldots , J$;

$z_{i j }$ is the usage measure per household in cell $i$ for calling category $j$;

$n_i$ is the number of households of segment type $i$ in the area;

$p _ { ij }$ is the fraction increase or decrease in cell $i$, category $j$ for "cost of visit" versus "reach-out".

The national or any regional projection can be made by summing over the appropriate areas.
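The projection (a13) is a weighted double sum, which the following sketch evaluates for one area exactly as the formula is written. The segment sizes, usage measures, and fractional changes are hypothetical placeholders, not figures from the AT&T study.

```python
def projected_usage(n, z, p):
    """Evaluate (a13): sum over segments i of n_i * sum over categories j of z_ij * p_ij.

    n[i]    : number of households in usage segment i
    z[i][j] : usage measure per household in segment i, calling category j
    p[i][j] : fractional change in segment i, category j
    """
    return sum(
        n_i * sum(z_ij * p_ij for z_ij, p_ij in zip(z_i, p_i))
        for n_i, z_i, p_i in zip(n, z, p)
    )

# Hypothetical area with two usage segments and two rate periods
n = [100, 50]
z = [[2.0, 1.0], [4.0, 3.0]]
p = [[0.5, 0.0], [0.25, 1.0]]

y = projected_usage(n, z, p)
print(y)
```

Summing the result of `projected_usage` over a list of areas gives the regional or national figure described above.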

The results of the analysis showed that, by introducing this new ad copy, AT&T could expect to earn more than \$100 million more from the segment they targeted without any increase in capital expenditures.

### Estimating the creative quality of advertisements.

In a study of the effectiveness of industrial print advertisements, D.M. Hanssens and B.A. Weitz [a9] related $24$ ad characteristics to recall, readership, and inquiry generation for $1,160$ industrial advertisements in Electronic Design. They used a model of the form

$$\tag{a14} y _ { i } = e ^ { a } \prod _ { j = 1 } ^ { p _ { t } } x _ { i j } ^ { b _ { j } } \prod _ { j ^ { \prime } = p _ { t } + 1 } ^ { p } ( 1 + x _ { i j ^ { \prime } } ) ^ { b _ { j ^ { \prime } } } e ^ { \mu _ { i } },$$

where

$y _ { i }$ is the effectiveness measure for the $i$th advertisement;

$x _ { ij }$ is the value of the $j$th non-binary characteristic of the $i$th advertisement (page number, advertisement size), $j = 1 , \ldots , p _ { t }$;

$x _ { i j ^ { \prime } }$ is the value ($0$ or $1$) of the $j ^ { \prime }$th binary characteristic of the $i$th advertisement (bleed, colour, etc.), $j ^ { \prime } = p _ { t } + 1 , \ldots , p$;

$e ^ { a }$ is the scale factor;

$\mu _ { i }$ is the error term.

They segmented $15$ product groups into three categories (routine purchase items, unique purchase items, and important purchase items) by factor analysis of purchasing-process similarity ratings obtained from readers of the magazine. They found that advertising characteristics accounted for more than 45% of the variance in the "seen" effectiveness measure, more than 30% of the variance in the "read most" effectiveness measure, and between 19% and 36% of the variance in inquiry generation. They also found that recall and readership were strongly related to format and layout variables (advertisement size, colours, bleed, use of photographs/illustrations, etc.), while the effects were weaker for inquiry generation.
The effects of some factors, such as advertisement size, were consistent across product groups and effectiveness measures, while others, such as the use of attention-getting methods (a woman in the advertisement, size of headline, etc.), were specific to the product category and the effectiveness measure.

### Advertising copy design.

Advertising copy design is usually viewed as a non-quantitative, creative process. R.R. Burke, A. Rangaswamy, Eliashberg, and J. Wind [a6] developed a rule-based system for advertisement copy design, demonstrating that this view is not altogether true. ADCAD is a rule-based expert system that allows managers to translate their qualitative perception of marketplace behaviour into a basis for deciding on advertising design. The ADCAD system assumes that before purchasing a brand a consumer must:

1) have a need that can be satisfied by purchasing this brand;

2) be aware that the brand can satisfy this need;

3) recognize the brand and distinguish it from its close substitutes; and

4) have no other behavioural or attitudinal obstacles to purchasing the brand.

Advertising can address one or more of these issues: it can stimulate demand for the product category, create brand awareness, facilitate brand recognition, and modify beliefs about the brand that might be barriers to purchase.

ADCAD starts by asking for background information about the product, the nature of competition, the characteristics of the target audience(s), etc., and it then develops a communication strategy for each target audience. Drawing on a knowledge base assembled from experts and the published literature, and using an artificial-intelligence inference engine, ADCAD then selects communications approaches to achieve the advertising and marketing objectives consistent with the characteristics of the consumers, the product, and the environment.
It makes recommendations concerning the position of the advertisement, the characteristics of the message, the characteristics of the presenter, and the emotional tone of the advertisement. While ADCAD does not exhibit the creative potential of human copywriters, it does provide important input into the development and assessment of advertising copy.

## Media selection and scheduling.

Media selection addresses how to find the best way to deliver the desired number of exposures to the target audience and to schedule the delivery of those exposures over the planning period. The effect of exposures on audience awareness depends on the exposures' reach, frequency, and impact:

Reach ($R$): The number of different persons or households exposed to a particular media schedule at least once during the specified time period;

Frequency ($F$): The number of times within a specified time period that an average person or household is exposed to the message;

Impact ($I$): The qualitative value of an exposure through a given medium (thus, a food advertisement would have a higher impact in "Good Housekeeping" than it would have in "Popular Mechanics").

The weighted number of exposures ($W E$) is the reach times the average frequency times the average impact, that is,

$$\tag{a15} W E = R \cdot F \cdot I.$$

The media-planning problem can now be viewed as follows. With a given budget, what is the most cost-effective combination of reach, frequency, and impact to buy?

To determine the total weighted-exposure value of a media schedule, one must know two things: the net cumulative audience of each media vehicle as a function of the number of exposures; and the level of audience duplication across all pairs of vehicles.
In the case of two media alternatives one would typically have an equation for net coverage as follows:

$$\tag{a16} R = r _ { 1 } ( X _ { 1 } ) + r _ { 2 } ( X _ { 2 } ) - r _ { 12 } ( X _ { 12 } ),$$

where

$R$ is the reach of the media schedule (i.e., total weighted-exposure value with replication and duplication removed);

$r _ { i } ( X _ { i } )$ is the number of persons in the audience of media vehicle $i$;

$r _ { 12 } ( X _ { 12 } )$ is the number of persons in the audience of both media vehicles.

(The $r _ { i } ( X _ { i } )$ are typically concave; an old study of the "Saturday Evening Post" showed that only 55% more families are reached with $13$ issues than with $1$ issue.) Equation (a16) can be easily generalized to the case of $n$ media alternatives.

Modeling approaches to media scheduling have relied heavily on mathematical programming procedures. MEDIAC [a16], for instance, assumes an advertiser is seeking to buy media for a year with $B$ dollars that will maximize sales. The advertiser can identify $S$ different segments of the market, and for each segment $i$ can estimate its sales potential in time period $t$:

$$\tag{a17} \overline { Q _ { it } } = n _ { i } q _ { it },$$

where

$\overline { Q _ { it } }$ is the sales potential of market segment $i$ in time period $t$ (potential units per time period);

$n_i$ is the number of people in market segment $i$;

$q_{ it}$ is the sales potential of a person in segment $i$ in period $t$ (potential units per capita per time period).

The sales potential represents the maximum attainable sales in a segment in a given time period. The more dollars spent on advertising in media reaching that segment, the higher the per capita exposure level and the higher the percentage-of-sales potential that will be realized.
Thus, the percentage-of-sales potential realized is a function of the per capita exposure level, $f ( y_{i t} )$, where $y_{it}$ is the exposure level of an average individual in market segment $i$ in time period $t$ (exposure value per capita). The problem of finding the best media plan can be stated as trying to:

Find the $x _ { j t }$ for all $j$ and $t$ that will

Maximize

$$\tag{a18} \sum _ { i = 1 } ^ { S } \sum _ { t = 1 } ^ { T } n _ { i } q _ { i t } f ( y _ { i t } )$$

Subject to

current exposure-value constraints

$$\tag{a19} y _ { i t } = \alpha y _ { i , t - 1 } + \sum _ { j = 1 } ^ { N } k _ { i j t } e _ { i j } x _ { j t };$$

lower and upper media-use rate constraints

$$\tag{a20} l _ { j t } \leq x _ { j t } \leq u _ { j t };$$

a budget constraint

$$\tag{a21} \sum _ { j = 1 } ^ { N } \sum _ { t = 1 } ^ { T } c _ { j t } x _ { j t } \leq B;$$

and non-negativity constraints

$$\tag{a22} x _ { j t } , y _ { i t } \geq 0,$$

where

$e_{ij}$ is the exposure value of one exposure in media vehicle $j$ to a person in market segment $i$;

$k_{i j t}$ is the expected number of exposures produced in market segment $i$ by one insertion in media vehicle $j$ in time period $t$;

$x _ { j t }$ is the number of insertions in media vehicle $j$ in time period $t$;

$c _ { j t }$ is the cost of one insertion in media vehicle $j$ in time period $t$;

$\alpha$ is the carry-over effect ($\alpha \in ( 0,1 )$), which determines the amount of exposure remaining in period $t$ in the absence of new advertising that period.

In this form, the problem has a non-linear but separable objective function that is subject to linear constraints. If the non-linear objective function is concave, the problem can be solved by piecewise-linear approximation techniques. If it is $s$-shaped and the problem is of modest size, it can be solved by dynamic programming [a16]. If the problem is not of modest size, [a16] shows that satisfactory, though not necessarily optimal, solutions can be obtained through the use of heuristic methods.
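To make the structure of (a18)-(a21) concrete, the sketch below evaluates the objective and the budget for a given media plan. It only scores a candidate plan and does not perform the optimization described in [a16]. The response function $f(y) = 1 - e^{-y}$, the zero initial exposure level, and all array layouts and numbers are assumptions introduced here.

```python
import math

def concave_response(y):
    """Hypothetical concave response f(y); an assumption, not from [a16]."""
    return 1.0 - math.exp(-y)

def mediac_objective(x, n, q, k, e, alpha, f=concave_response):
    """Evaluate (a18), with exposure levels built up by the recursion (a19).

    x[j][t]    : insertions in vehicle j, period t
    n[i]       : people in segment i
    q[i][t]    : per capita sales potential of segment i, period t
    k[i][j][t] : expected exposures in segment i per insertion in j at t
    e[i][j]    : exposure value of one exposure in vehicle j for segment i
    Assumes the exposure level before the first period is zero.
    """
    total = 0.0
    for i in range(len(n)):
        y_prev = 0.0
        for t in range(len(x[0])):
            y = alpha * y_prev + sum(
                k[i][j][t] * e[i][j] * x[j][t] for j in range(len(x))
            )
            total += n[i] * q[i][t] * f(y)
            y_prev = y
    return total

def plan_cost(x, c):
    """Left-hand side of the budget constraint (a21)."""
    return sum(c_jt * x_jt for c_j, x_j in zip(c, x) for c_jt, x_jt in zip(c_j, x_j))

# Hypothetical case: one segment, one vehicle, two periods
x = [[1.0, 0.0]]
n, q = [100], [[1.0, 1.0]]
k, e = [[[1.0, 1.0]]], [[1.0]]
c = [[40.0, 40.0]]

value = mediac_objective(x, n, q, k, e, alpha=0.5)
linear_value = mediac_objective(x, n, q, k, e, alpha=0.5, f=lambda y: y)
print(value, linear_value, plan_cost(x, c))
```

With a single insertion in the first period, the exposure level is $1$ in period $1$ and $\alpha = 0.5$ in period $2$, which shows how the carry-over term in (a19) lets advertising in one period contribute to realized sales potential in the next.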

MEDIAC represents an important attempt to include the dimensions of market segments, sales potentials, diminishing marginal returns, forgetting, and timing into a media-planning model. Other heuristic procedures include those in [a21], [a23], the Solem model [a5], and the ADMOD model [a1].

## Conclusions.

The advertising area has a long history of effective mathematical modeling.

Its further development has been constrained by the limitations of historical data. However, new data sources emerging from direct response marketing through the Internet and direct mail, combined with more powerful computational methods, are spawning a new set of mathematical models in this area.