Joint Optimization of Privacy and Cost of In-App Mobile User Profiling and Targeted Ads

Online mobile advertising ecosystems provide advertising and analytics services that collect, aggregate, process, and trade a rich amount of consumers' personal data to carry out interest-based ad targeting. This practice has raised serious privacy concerns, and a growing number of users report feeling uncomfortable while using Internet services. In this paper, we address these concerns by developing a cost-effective dynamic optimisation framework that preserves user privacy in profiling, ads-based inferencing, temporal apps usage behavioral patterns, and interest-based ad targeting. A major challenge in solving this dynamic model is the lack of knowledge of time-varying updates during the profiling process. We formulate a mixed-integer optimisation problem and derive an equivalent problem to show that the proposed algorithm does not require knowledge of the time-varying updates in user behavior. We then develop an online control algorithm that solves the equivalent problem, overcoming the difficulty of solving the nonlinear program by decomposing it into various cases, and achieves a trade-off between user privacy, cost, and targeted ads. We carry out extensive experiments and demonstrate the proposed framework's applicability by implementing its critical components in a Proof-of-Concept (POC) 'System App'. We compare the proposed framework with other privacy-protecting approaches and investigate whether it achieves better privacy and functionality across various performance parameters.


I. INTRODUCTION
Online advertising companies track users with the help of millions of mobile applications (apps) offered via various app markets, profile them, and enable targeted advertising in which users' personal information plays an important role. The advertising ecosystems connect billions of mobile devices, including smartphones and tablets, with the help of various tracking technologies to collect, process, map, and disseminate users' private and sensitive information. With strict laws imposed by governments, in addition to an abundance of anti-tracking tools and policies, advertising companies are gradually phasing out the third-party cookies used for (interest-based) ad targeting. Google's announcement of Chrome's cookie phase-out pushes Data Management Platforms (DMPs) and Demand-Side Platforms (DSPs) to brand their data and measure performance in a cookie-less world. In addition, users are increasingly concerned about their data in the long term, e.g., the data they leave after death.
An important factor in online targeted advertising is to deliver relevant ads to users, to achieve better view/click-through rates, without exposing their private and sensitive information within or outside the advertising ecosystem. In its current working mechanism, user tracking and advertising have raised significant privacy concerns, as suggested in several studies [1]-[6]. Other works show the extent to which consumers' activities are tracked by third parties and across multiple apps [7], mobile devices leaking Personally Identifiable Information (PII) [8], [9], users' sensitive information being accessed via APIs [10], and profile inference attacks based on monitoring ads [11]. Research studies indicate that, unless consumers have specifically consented to it, ad/tracking companies evaluate user behavior and tailor advertising to reach specific audiences. The American self-regulatory AdChoices program presents a platform for consumers to opt out of targeted ads and of tracking done by the companies participating in WebChoices. However, opting out would result in less revenue, in addition to less relevant ads and lower click-through rates, as evident in [12].
This paper contributes in several ways. The proposed framework protects the user's privacy in the mobile advertising ecosystem from the profiling of specific private attributes that are potentially used by analytics companies for tracking user behavior (over web and apps usage), and it controls the magnitude of ads targeting based on these attributes. To provide profiling privacy, we note that private attributes (i.e., the user's private interests) dominate user profiles during the profiling process; their dominance can be lowered so as to produce disturbance in user profiles and hence control interest-based targeting and temporal changes in the user profiling process. The proposed framework provides optimal privacy-preserving user profiling that is cost-effective and achieves a trade-off between user privacy and targeted ads. We recommend the use of apps (other than the set of installed apps) to flatten users' usage patterns and thereby protect the temporal apps usage behavior, a scenario of particular relevance when users use mobile phones during business hours. Ideally, these recommended apps can be run when there is no user activity, bringing the average apps usage in line with actual usage patterns. However, a major challenge is to evaluate the temporal changes in user profiles and apps usage, i.e., the lack of knowledge of time-varying updates in user profiles, which we classify as updates caused by browsing history/searches, interactions with ads (e.g., views/clicks), the types of apps installed and used, and the user's time-varying future apps usage behavior. Subsequently, using Lyapunov optimisation, we develop an optimal control algorithm for identifying updates in user profiles and in the user's usage of actual apps, so as to capture the temporal changes during the profiling process; note that other optimisation mechanisms for dynamic systems could also be used, e.g., [13]. User profiles consist of various interests, gathered in similar categories, that are derived from the user's (private) behavioral activity: utilising installed apps, activities over the web, and interactions with ads. Our purpose is to protect against attacks on privacy, via app-based profiling (i.e., context profiling) and ad-based profile inferencing (i.e., user profiling based on targeted ads), of selected interest categories in a user profile that the user may consider private. For example, a user may not wish the categories of gambling or porn to be present in their profile or to be targeted with related ads, which is of particular relevance when business devices are used for private purposes. In addition to privacy protection, which imposes a cost overhead for running the recommended apps, the proposed framework minimises the introduced cost, termed 'resource overhead', by bounding the use of new apps within the boundaries of the weightages assigned to interest categories in user profiles.

[Footnote: A DMP is a unified, centralised technology platform used for collecting, organising, and activating large sets of data from disparate sources. A DSP allows advertisers to buy impressions across several different publisher sites, targeted to specific users based on key online behaviors and identifiers. See https://www.lotame.com/dmp-vs-dsp/ for a detailed discussion of DMPs and DSPs.]
[Footnote: https://optout.aboutads.info/?c=2&lang=EN]
Furthermore, we investigate the profiling process used by Google AdMob, a major analytics network, both while a user profile is being established and as it evolves over time, by investigating the relation between mobile app characteristics and the resulting profiles. We carry out extensive profiling experiments by 1) examining the contribution of each individual app to a user profile, 2) experimenting with the recommended apps for protecting the privacy of the user's apps usage behavior and evaluating their effect on user profiles, and 3) evaluating resource overheads; overall, these experiments ran for over five months. Our experiments show that the mapping of interest categories can be predicted from installed and used apps with an accuracy of 81.4%, along with the private and dominating interest categories based on the user's activity over a mobile device. In addition, through these experiments, we found that the profiling interests are selected from a pre-defined set of interest categories and that the apps-to-interests mapping is deterministic, requiring a specific amount of time (up to 72 hours) and a certain level of activity to establish a user profile.
We propose various changes in the 'User Environment' of a mobile advertising ecosystem, i.e., we suggest a 'System App' with the following functionalities: it implements local user profiling based on the user's apps usage behavior, browsing history/searches, and interactions with ads; it includes a 'System Engine' that keeps a local repository and determines recommended apps; it protects the user's private sensitive attributes; and it implements the proposed online control algorithm for jointly optimising user privacy and the associated cost. Furthermore, we implement a Proof-of-Concept (POC) 'System App' of our proposed framework to demonstrate its applicability in an actual environment. As opposed to legacy advertising operations, where users are tracked for their activities, the 'System App' passes anonymous apps usage info/statistics and the generated profiles to the analytics servers within the advertising system. In addition, the analytics server then only evaluates stats for apps usage, which are recorded for billing both the ad system and the app developers.
Finally, we provide a hypothetical discussion of the use of other privacy protection mechanisms, e.g., differential privacy and cryptographic approaches, in mobile advertising environments, and compare the proposed framework with these mechanisms for various performance parameters.
The paper is organised as follows: related work is presented in Section II. Section III presents background on user profiling and the ads ecosystem, the addressed problem, and the threat model. In Section IV, we present the system model and investigate the profile creation and user profiling process. Section V presents the optimal privacy-preserving system and further discusses the proposed framework. The optimal control algorithm is discussed in Section VI. Various evaluation metrics are discussed in Section VII. Section VIII discusses the system evaluation, our experimental setup, and results. We further discuss the applicability of the proposed framework and its comparison with other privacy protection mechanisms in Section IX. Finally, we conclude in Section X.

II. RELATED WORK
In our recent work [1], we provide a detailed survey of privacy leakages in the profiling process, the leakage of personal information by advertising platforms and ad/analytics networks, and measurement analyses of targeted advertising based on users' interests and profiling context; we compare various privacy-preserving advertising systems for capabilities such as the underlying architecture, the privacy mechanisms, and the deployment scenarios; furthermore, we present a detailed discussion of the ad delivery process in both in-app and in-browser targeted ads. Privacy threats in targeted advertising have been extensively studied in the literature, e.g., direct and indirect (inferred) leakage of private information [3], [14]-[16] and third-party ad tracking [17]-[20]. These works show the prevalence of user tracking (from users' online data) in both web and mobile environments and demonstrate the possibility of inferring users' PII, such as email addresses, age, gender, relationship status, etc. The ad/analytics libraries embedded in mobile apps leak users' personal information to the ad system and, most likely, to third-party trackers, by whom it is systematically collected. For example, [21], [22] study the information collected by analytics libraries integrated within mobile apps and the leakage of private information. Similarly, the authors in [23] expose privacy and security risks by analysing 100,000 Android apps, finding that the majority of the embedded libraries collect and share private information. Other studies [24], [25] find that the majority of mobile apps do not implement private APIs and send private information to ad servers.
Recall that user profiling and ads targeting are carried out based on the user's activity over apps and the Internet. The authors in [26] manually create user profiles and find that the majority of targeted ads were based on the fabricated profiles. In our previous work [27], later supported by another study [28], we examine various tracking information sent to the ad networks and the level of ads targeting based on such profiling information. Another work [16] investigates the information collected by ad networks using installed apps and reverse-engineers the targeted ads. In line with the above works, we also study [11] the leakage of private sensitive information by examining actual network traffic through vulnerabilities in mobile analytics services, reconstructing the exact user profiles of several participating volunteers, and further studying their influence on ads targeting; this study mainly covered Google Mobile Analytics and Flurry Analytics. Third-party trackers also actively collect, manage, and distribute users' sensitive information, which has been widely studied in the literature; e.g., [19], [29]-[33] study the distribution of third-party trackers across the web and Android apps and their impact on user privacy.
Other works [34]-[38] suggest app-based user profiling, stored locally on the mobile device. There are various in-app privacy-preserving mobile advertising and personalisation proposals, such as Adnostic [39], Privad [40], RePriv [41], MobiAd [36], SplitX [42], and ProfileGuard [35]. ProfileGuard is an app-based profile obfuscation mechanism for protecting user privacy using various obfuscation strategies. However, it protects user privacy only by reducing the dominance level of particular profiling interests, which has a strong impact on the targeted ads and would likely attract a majority of irrelevant ads. Similarly, there are other solutions that are crypto-based implementations of various techniques under Private Information Retrieval (PIR) [1], and Blockchain-based solutions [34], [43] for a decentralised advertising system that enables private profiling and targeted ads.
We note that the majority of app-based privacy protection mechanisms protect user privacy based on profiling interests while disregarding various other factors, such as privacy based on an application profile, the types of apps installed and used along with their time-varying future usage behavior, the frequency of apps usage, the user's web searches, and the user's interactions with ads, e.g., clicks. Furthermore, these works do not consider the trade-off between user privacy, the targeted ads, and the cost of achieving user privacy. This work jointly optimises the user's profiling privacy, the user's apps usage behavior, and the cost of achieving the desired level of privacy. We provide an online control algorithm that achieves a trade-off between user privacy and targeted ads; this is done by optimally identifying updates in user profiles and in the user's usage of actual apps, capturing temporal changes during the profiling process. We mainly address attacks involving app-based profiling, ads-based inferencing, and the analysis of users' behavior by sniffing the network traffic of legitimate users.

III. PROBLEM FORMULATION
Advertising and Analytics (A&A) companies rely on users' tracking data to profile them and to target them with relevant advertisements, both web- and app-based, so as to cover a vast audience of diverse interests and increase advertising revenue. This exposes sensitive information about users, in particular their online behavior, e.g., web browsing/searching; profiling based on apps representing sensitive and damaging information, e.g., gambling problems indicated by a game app; or mobile apps usage behavior, e.g., playing games early in the morning in bed or using gambling apps during the lunch break in office hours. In the following, we present the privacy issues related to the profiling process for both 'Web & App' activity and apps usage behavior; subsequently, we present the problem and the threat model addressed in this paper.

A. USER PROFILING
Advertising companies, e.g., Google, profile users based on the information a user adds to their Google account, the company's estimation of the user's interests based on mobile apps usage and web histories, and data collected from the analytics companies partnering with Google; the companies then carry out targeted advertising based on the personal information extracted through various tracking technologies. Figure 1 shows an example user profile, estimated by Google, which consists of demographics (e.g., gender, age ranges) and profiling interests, e.g., 'Action & Adventure Films'. We call this an Interests profile, with all interests defined by the A&A companies (e.g., Google) and used by them to individually characterise users' interests across the advertising ecosystem. Similarly, we introduce the Context profile, which is the combination of apps installed from various categories, e.g., Games, Entertainment, Sports, etc., on a user's mobile device; a detailed discussion of the Context profile is given in Section IV. We note, from our extensive experimentation [35], that the Context profile is also (partly) used by the analytics companies to individually characterise users' interests across the advertising ecosystem.
Furthermore, ads targeting is based on demographics to reach a specific set of potential customers that are likely to be within a specific age range, gender, etc. Google presents a detailed set of demographic targeting options, such as age ranges, e.g., '18-24', '25-34', etc., gender, e.g., 'Male', 'Female', and other options, e.g., parental status, location, etc. We note that this profiling is the result of the interactions of user devices with the AdMob SDK [35], which communicates with Google Analytics for deriving user profiles. A complete set of 'Web & App' activities of an individual user can be found under 'My Google Activity', which helps Google make its services more useful. Figure 2 shows the various sources/platforms that Google uses to collect data and target users with personalised ads. These include a wide range of sources enabled with tracking tools and technologies; e.g., the 'Web & App' activities are extracted with the help of Android/iOS SDKs, their interactions with Analytics servers within the Google network, cookies, conversion tracking, web searches, the user's interactions with presented ads, etc. Similarly, Google's connected home devices and services rely on data collected using cameras, microphones, and other sensors to provide helpful features and services. The tracking data (up to several GBs), personalised for individual users, can be exported using Google Takeout, for backup or for use with a service outside of Google. This includes data from a range of Google products, such as email conversations (including Spam and Trash emails), contacts, calendars, browsing and location history, and photos.

[Footnote: The Google AdMob profile can be accessed through the Google Settings system app on Android-based devices: Google Settings → Ads → Ads by Google → Ads Settings.]
[Footnote: Demographic targeting: https://support.google.com/google-ads/answer/2580383?hl=en]

B. ADVERTISING SYSTEM
Users install various apps on their mobile devices, which are utilised with specific frequency. The mobile apps include an analytics SDK, which directly reports the user's activity (as mentioned in the above section) and sends ad requests to the analytics and ad network. Various advertising entities play an important role in enabling tracking and ads dissemination in an ad system, comprising the Aggregation, Analytics, Billing, and Ads Placement servers.
The collected tracking data is used by the Analytics server, which constructs Interests profiles (associated with specific mobile devices and corresponding users) with specific profiling interests related to the user's (private) behavior. The targeted ads are served to mobile users according to their (individual) profiles; we note that other, i.e., generic, ads are also served [27]. The Billing server includes the functionality related to monetising ad impressions (i.e., ads displayed to users in specific apps) and ad clicks (user actions on presented ads).

C. RESEARCH STATEMENT
The problem addressed in this paper is that the A&A companies track mobile users' activities, profile them (inferring relationships among individuals, their monitored responses to previous advertising activity, and their temporal behavior over the Internet), and target them with ads specific to individual interests. The user profiling and ads targeting expose sensitive information about users [1]; e.g., the target could browse medical-related websites or apps, revealing to the advertising systems (including third parties, such as the website owner) that the user has medical issues.
Furthermore, we address the privacy issues where an adversary (either the analytics companies examining user activity or an intruder listening to the ad or control traffic) can determine the user's apps usage activity, e.g., that someone plays games late at night or early in the morning, with their activity intercepted by their neighbors. Note that the user's apps usage activities can be exposed by intercepting the apps' communication over the connected network [11]; in addition, users' activities are exposed to the advertising systems during their interactions via ad/analytics SDKs. Figure 4 shows an example apps usage profile (in plaintext) for a typical day, showing a direct threat to the user's apps usage privacy.

FIGURE 3: Advertising system and user environment. (1) Data collection and tracking, (2) Send tracking data to Aggregation server, (3) Forward usage info to Analytics server, (4) User profiling, (5) Send profiling info to APS, (6) Deliver targeted/generic ads, (7) Billing for app developer, (8) Billing for ad system, (9) Advertiser, who wishes to advertise with the ad system, uploads ads.

In particular, we address three privacy attacks:
1. Legitimate user profiling by A&A companies: user profiling is implemented via an analytics SDK that reports users' activities to the A&A companies and hence intercepts various requests to/from mobile users.
2. Indirect privacy attack: third parties could intercept and infer user profiles based on targeted ads. We note that an adversary (e.g., an intruder listening to ad traffic) can determine apps usage activities (e.g., a user using gambling apps), which can be exposed by intercepting the apps' communication on the connected network [11]. In addition, the adversary may intercept the user's other interactions, e.g., interactions with ads (views/clicks), web searches, and data communicated via connected devices or sensors.
3. Apps usage behavioral attack, aimed at learning the user's apps usage activities. Such activities are also exposed to the A&A companies during interactions via the ad/analytics SDKs embedded in mobile apps [1], [27].

For this reason, using Lyapunov optimisation, we develop an optimal control algorithm for identifying updates in user profiles based on apps usage behavior and its temporal changes during the profiling process. We presume that users do not want to expose their private interests to adversaries (including advertising agencies) but are willing to receive relevant ads based on their interests.
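To make the Lyapunov-based control idea concrete, the sketch below shows the generic drift-plus-penalty pattern on which such online algorithms are commonly built. This is an illustrative toy, not the paper's algorithm: the queue, penalty function, and action set here are all invented placeholders; the actual queues and controls are defined in Section VI.

```python
# Generic drift-plus-penalty step (illustrative sketch only; the paper's
# actual queues, penalty terms, and control actions appear in Section VI).
# 'queue' is a virtual queue of unserved profile deviation; 'V' trades
# the penalty (e.g., resource overhead) against queue stability.
def drift_plus_penalty_step(queue, arrivals, actions, penalty, V):
    best_action, best_score = None, float("inf")
    for a in actions:
        # Lyapunov bound: minimise V * penalty(a) + Q(t) * (arrivals - service),
        # where here the action itself is the amount of service offered.
        score = V * penalty(a) + queue * (arrivals - a)
        if score < best_score:
            best_action, best_score = a, score
    # Virtual queue update: Q(t+1) = max(Q(t) - service, 0) + arrivals
    new_queue = max(queue - best_action, 0.0) + arrivals
    return best_action, new_queue
```

Larger V favours low penalty (cost) at the price of slower queue convergence; this is the generic knob behind privacy/cost trade-off parameters such as β in the sequel.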

D. THREAT MODEL
Our primary goal in this work is to achieve optimal privacy-preserving profile management that preserves the user's privacy with respect to the profiling interests derived via apps usage, the apps usage behavior, the user's web history/searches, and their interactions with ads. We start by analysing the profiling process, identifying the dominating interests that expose user privacy and could affect ads targeting, the proportion of interests in the user profile, and the apps usage activity. We further compare the proposed solution and its applicability with other privacy-preserving mechanisms (such as differential privacy, anonymisation, randomisation, profile-based obfuscation, Blockchain-based solutions, and crypto-based mechanisms, e.g., private information retrieval) in an advertising scenario for the problems addressed in this work. Finally, we evaluate the trade-off between the achieved user privacy, the cost of achieving privacy, and targeted ads. Hence, in this paper, we jointly optimise the user's privacy due to profiling interests, the cost of achieving privacy, and the apps usage behavior.

IV. SYSTEM COMPONENTS
We formalise the system model, which consists of app profiles, interest profiles, and the evolution of the resulting profiles as the applications in an app profile are used. In particular, we provide insights into the establishment of Interests profiles by individual apps in the Context profiles and then show how the profiles evolve when apps other than the initial set are utilised.

A. SYSTEM MODEL
We denote an app by a_{i,j}, i = 1, ..., A_j, where A_j is the number of apps that belong to an app category Φ_j, j = 1, ..., φ, and φ is the number of different categories in a marketplace (e.g., in Google Play or the Apple App Store). For example, there are several app categories in the Google Play Store, such as 'Art & Design', 'Books & Reference', 'Entertainment', etc.; we reference each category with an index j. Similarly, an individual category j may have numerous applications, which can be downloaded and used; e.g., the 'LinkedIn Learning: Online Courses to Learn Skills' app is categorised under 'Education' and is represented with an index i. Hence, for this app, a_{i,j} can be read as 'the app indexed i, categorised under the j-th category'.
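As a concrete toy illustration of this indexing, the snippet below builds a tiny marketplace (the contents are invented, not real market data) and recovers the (i, j) pair for one app:

```python
# Toy illustration of the a_{i,j} indexing; marketplace contents are invented.
marketplace = {
    "Education": ["LinkedIn Learning", "Duolingo"],  # one category Phi_j
    "Entertainment": ["Netflix"],
}
categories = list(marketplace)  # j = 1, ..., phi, with phi = len(categories)

# Locate 'LinkedIn Learning' as a_{i,j}: the i-th app of the j-th category.
j = categories.index("Education") + 1                        # 1-based j
i = marketplace["Education"].index("LinkedIn Learning") + 1  # 1-based i
# Here (i, j) = (1, 1): the app is a_{1,1} under the 'Education' category.
```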
In addition, we note that an individual app a_{i,j} is characterised by a set of keywords κ_{i,j} that includes the app's category (e.g., Business, Entertainment, etc.) and dictionary keywords (specific to this particular app) from its description. We denote by A the entire set of mobile apps in a marketplace, organised in various categories.
A user may be characterised by the combination of apps installed on their mobile device(s), comprising a subset S_a ⊆ A. For example, S_a may comprise various apps, e.g., 'LinkedIn', 'Outlook', 'Uber', 'Zoom', etc. Subsequently, the Context profile K_a can be defined as:

K_a = { (a_{i,j}, Φ_j) : a_{i,j} ∈ S_a }    (1)

The A&A companies, such as Google or Flurry, partly profile and target users based on the combination of mobile apps installed on their devices, i.e., the Context profile. We have the following constraints on the user's Context profile:

1 ≤ n(K_a) ≤ n(A),  a_{i,j} ∈ Φ_j,  Φ_j ≠ ∅,  ∀ a_{i,j} ∈ K_a    (2)

Here, n(K_a) is the total number of apps installed on a device; hence, Eq. (2) ensures that n(K_a) does not exceed the total number of available apps, that a mobile device has at least one app installed for contextual targeting, that an app a_{i,j} belongs to one of the specified categories, and that the app's category is not undefined within an app market. Various important notations and their descriptions are presented in Table 1.

[Footnote: https://play.google.com/store]
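The Context-profile constraints of Eq. (2) can be checked mechanically. The sketch below is illustrative only: the data structures and function name are our own, not from the paper.

```python
# Sketch of validating a Context profile against the Eq. (2) constraints;
# the data structures and names here are illustrative, not from the paper.
def valid_context_profile(K_a, marketplace):
    """K_a: dict app_id -> category name; marketplace: dict category -> set of app_ids."""
    n_A = sum(len(apps) for apps in marketplace.values())  # n(A): all market apps
    if not (1 <= len(K_a) <= n_A):
        return False  # at least one app installed, and no more than n(A)
    for app, cat in K_a.items():
        # every installed app must belong to a category defined in the market
        if cat not in marketplace or app not in marketplace[cat]:
            return False
    return True
```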

B. REPRESENTING APPS IN USER PROFILE
We note that the A&A companies classify users by defining a set of profiling interests G, i.e., characteristics that may be assigned to users as an Interests profile. For example, Google profiling interests are grouped hierarchically under various interest categories, with specific interests. We denote an interest by g_{k,l}, k = 1, ..., G_l, where G_l is the total number of interests that belong to an interest category Ψ_l, l = 1, ..., ψ, and ψ is the total number of interest categories defined by the analytics companies. An individual interest g_{k,l} is characterised by a set of keywords κ_{k,l}. To enable interest targeting, an Interests profile I_g consists of a subset S_g of specific interests:

I_g = { g_{k,l} : g_{k,l} ∈ S_g }    (3)

Although various types of information may be used to generate user profiles, as shown in Figure 2, our focus is mainly on installed and used/unused apps, the history resulting from web browsing or web searches, and the clicked ads, collectively described as the 'Web & Apps' activity. Similarly, other targeting criteria can also be represented in an Interests profile as specific interests, e.g., demographics, contributing analytics platforms, social media, and other Google services. Example demographic interests are also shown in Figure 1.
The targeting components are eventually grouped into the Interests profile used for ads targeting. The Interests profiles undergo different processes: the profile establishment process (i.e., the derivation of the Interests profile from apps, K_a → I_g, generating a specific set of interests once users' behaviour is observed) and the profile development process (i.e., the minimum level of activity required to develop a stable user profile I_g^f), i.e., I_g^f = I_g ∪ I_g', where I_g' denotes the interests added as the profile evolves. Figure 5 shows the user profiling process; a detailed discussion of these processes can be found in [34]. Furthermore, detailed experimental evaluations of the profiling rules, the apps usage threshold during profile establishment, and the mapping rules from Context profile to Interests profile can be found in [35].

[Footnote: The full list of installed apps is located at /data/data/com.android.vending/databases/library.db.]
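Since our experiments [35] indicate that the apps-to-interests mapping is deterministic, the establishment step K_a → I_g behaves like a lookup over app categories. The following sketch is a toy illustration: the mapping entries are invented, whereas the real mapping is derived empirically in [35].

```python
# Toy deterministic mapping f from app categories to interest categories.
# The real mapping is derived empirically [35]; the entries here are invented.
F = {
    "Education": {"Jobs & Education"},
    "Games": {"Computer & Video Games", "Action & Adventure Films"},
}

def establish_interests_profile(context_profile):
    """K_a -> I_g: union of the interests drawn by each installed app's category."""
    I_g = set()
    for app, category in context_profile.items():
        # apps whose category draws no interests contribute an empty set
        I_g |= F.get(category, set())
    return I_g
```

Note how apps in unmapped categories yield an empty set, matching the observation (Section IV-B) that some installed apps do not contribute to profiling.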

FIGURE 5: Profile establishment & evolution processes. I_∅ is the empty profile before any activity on the user device. During the stable states, the Interests profiles I_g or I_g^f remain the same, and further activity in the same apps has no effect on the profiles [34].
In order to represent the share (along with the dominating interest categories) of each interest category, we assign a weightage to each category Ψ_l present in the final user profile I_g^f, represented as η_l(Ψ_l), under the following two constraints:

η_l^min ≤ η_l(Ψ_l) ≤ η_l^max, ∀ l ∈ ψ    (4)
0 ≤ η_l(Ψ_l) ≤ η_l^max    (5)

Hence, the weightage given to an interest category η_l(Ψ_l) lies within the η_l^min and η_l^max threshold weightages. Subsequently, the user profile is represented as I_g = {η_l(Ψ_l)}, ∀ l ∈ ψ.
We note, from our extensive profiling experimentation [35], that some of the apps installed on a mobile device do not contribute to profiling, i.e., specific apps draw no interests (an empty set). In addition, Eq. (5) ensures that the assigned weightage is non-negative and cannot exceed the maximum threshold within a user profile.

Example -Assigning Weightages
Let a user have five mobile apps installed on her device: three from category a and one each from categories b and c, with the following percentages of usage time: a = 20%, b = 70%, c = 10%. For simplicity, we do not consider apps that are installed but not used, which can be assigned the lowest weightages, e.g., 1/n(K_a), including system-installed apps. The weightage for each category η_l(Ψ_l) is then evaluated as: η_l(a) = 3/5 + 20/100 = 0.8; η_l(b) = 1/5 + 70/100 = 0.9; η_l(c) = 1/5 + 10/100 = 0.3.
[Footnote: Use adb logcat -v threadtime on an Android-based device to find the running time of an app; threadtime (the default) displays the date, invocation time, tag, priority, TID, and PID of the thread issuing the message.]

TABLE 1: Summary of notation.
A: Available apps in a marketplace
φ: Total number of app categories; Φ_j is a selected category, j = 1, ..., φ
S_a: Subset of apps installed on a user's mobile device
κ_{i,j}: Set of keywords associated with an individual app a_{i,j}, including its category
K_a: App profile consisting of apps a_{i,j} and their categories Φ_j
G: Set of interests in the Google interests list
Ψ_l: Interest category in G, l = 1, ..., ψ, where ψ is the number of interest categories defined by Google
S_g: Subset of Google interests in G derived from S_a
I_g: Interest profile consisting of interests g_{k,l}, g_{k,l} ∈ S_g
g_{k,l}: An interest in I_g, k ∈ G_l, l ∈ ψ
κ_{k,l}: Set of keywords associated with an individual interest g_{k,l}, including its interest category
S_g^{i,j}: Set of interests derived from an app a_{i,j}
f: Mapping from an app (or history, or ad) category to an interest category
η_l(Ψ_l): Weightage assigned to interest category Ψ_l in a profile, with thresholds η_l^min and η_l^max
Π_m: Profiling component representing browsing history/searches
Γ_n: Profiling component representing interactions with ads
t: Time slot
C_t(η_l(Ψ_l)): Change in an interest category due to changes in apps usage
C_t(η_m(Π_m)): Change in web browsing history/searches
C_t(η_n(Γ_n)): Change in interactions with ads
U_t(K_a): Usage of apps in K_a at time slot t; the average usage of each app is denoted Ū_τ(K_a)
η_{l'}(Ψ_{l'}): Weightage assigned to an interest category generated by recommended app(s), where l' ≠ l
S_o: Set of recommended apps
C^t: Reduction of the user's overall time available for the original apps due to use of recommended apps
R^t: Resource usage of recommended apps; R_b^t, R_c^t, R_p^t respectively denote battery consumption, communication, and processing resource usage
β: Adjustable parameter to achieve a trade-off between user privacy and targeted ads
R_l^t: An advertising request reported at t via the ad/analytics SDK, e.g. an ad request for display or an ad click
ε: Adjustable parameter controlling apps usage to preserve the privacy of the user's app usage behavior
p(t): Penalty for minimising the upper bound on C^t, profiling privacy, and apps usage behavior

Hence, the user is expected to be targeted with ads related to category b in the highest proportion, followed by a and c. Note that, for simplicity, these weightages can be normalised to the range 0-1; the proportions for delivery of targeted ads would then be: a = 0.40, b = 0.45, and c = 0.15.
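As a minimal sketch, normalising raw interest-category weightages into ad-delivery proportions can be done as follows; the raw values are hypothetical, chosen only to reproduce the a/b/c example above.

```python
# Normalise raw interest-category weightages into ad-delivery
# proportions (the raw weightages are assumed, for illustration).
def normalise(weights):
    total = sum(weights.values())
    return {k: round(v / total, 2) for k, v in weights.items()}

raw = {"a": 8.0, "b": 9.0, "c": 3.0}  # assumed raw weightages
print(normalise(raw))  # b gets the highest proportion, then a, then c
```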

C. REPRESENTING BROWSING HISTORY/SEARCHES AND AD-INTERACTIONS IN A USER PROFILE
We represent the profiling components Π_m and Γ_n as user history/searches and ad interactions (e.g. ad clicks), respectively, and assign weightages to both components. Note that the minimum threshold on the left-hand sides of Eq. (6) and (7) is taken as the minimum of η_l^min(Ψ_l) and '0', to represent the component's weightage either as higher than the minimum weightage of the present interests (and hence show its importance among the current targeting components) or to lower its dominating factor to slightly above '0'. Subsequently, we have the following equivalent user profile, as a representation of the weightages of all targeting criteria:

D. PROFILE UPDATING
An important factor in ad targeting is tracking the user's activity to find temporal changes in a user profile; hence, the profile and the ad targeting are updated each time a variation in user behavior is observed, i.e. the targeting criteria result in interests other than the existing set of interests. We note that the following criteria (among others) are used to track changes in a user profile: installation/un-installation of an app, i.e. the user uses a new set of apps that has no overlap with the existing set of apps S_a; an increase/decrease in the use of an existing app; starting to use a previously unused app; interactions with ads; and changes in web browsing history/searches. Let η_l^min and η_l^max be the new minimum and maximum thresholds due to changes in apps usage; then the following conditions may hold (provided that the usage time is i.i.d., i.e. independently and identically distributed, with some unknown probability distribution). In addition, such changes in browsing history/searches and interactions with ads can also be expressed with threshold limits, as explained in the next section.
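The update triggers above can be sketched as follows; the data structures, threshold, and function name are illustrative assumptions, not the paper's implementation.

```python
# Sketch: detect profile-update triggers from app-usage changes
# (hypothetical structures; the tolerance threshold is illustrative).
def profile_update_needed(old_apps, new_apps, old_usage, new_usage, tol=0.1):
    """Return True when the targeting criteria change: the app set
    changed, or an existing app's usage shifted by more than `tol`."""
    if set(new_apps) != set(old_apps):          # installation/un-installation
        return True
    for app in old_apps:                        # usage increase/decrease
        if abs(new_usage.get(app, 0.0) - old_usage.get(app, 0.0)) > tol:
            return True
    return False

print(profile_update_needed(["news"], ["news", "chess"], {"news": 0.5}, {"news": 0.5}))
```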

E. PROFILE EVOLUTION
Furthermore, subsequent changes in a user profile (w.r.t. time t) are represented as C_t(η_l(Ψ_l)), C_t(η_m(Π_m)), and C_t(η_n(Γ_n)), for changes in apps usage, web browsing history/searches, and interactions with ads, respectively. Hence, Eq. (8) can be re-written accordingly. Since the assigned weightages should always be non-negative and bounded by maximum thresholds (i.e. the maximum convergence point in a specific period, e.g. every 24 hours), we need to ensure the following: 0 < C_t(η_l(Ψ_l)) ≤ C_t^max(η_l(Ψ_l)), ∀t (13); 0 < C_t(η_m(Π_m)) ≤ C_t^max(η_m(Π_m)), ∀t (14); 0 < C_t(η_n(Γ_n)) ≤ C_t^max(η_n(Γ_n)), ∀t (15). As mentioned earlier, the other profiling components, such as history/searches and interactions with ads, are also mapped to the profiling interests Ψ_l to produce a unified profile with different dominating interest categories, e.g. for ad interactions: f : Γ_n → Ψ_l. We presume that all changes in user profiles are distributed with an unknown probability distribution and that the profile weightages are deterministically bounded by a finite constant, e.g. C_t^max(η_m(Π_m)) for web history, so that Eq. (12) can be re-written, as a unified set of profiling interests incorporated in the user profile at time slot t, as in Eq. (16). Similarly, the newly updated (or changed) interest categories are reflected in the profile at t + 1 as: I_g^{t+1} = I_g^t + C_{t+1}(η_l(Ψ_l)). After n time slots, I_g^{t+n} is the maximum convergence point of a user profile, i.e. the point where the user is targeted with the most relevant ads. We envisage that this information is regularly updated, e.g. once per 24 hours, which we call the evolution threshold, i.e. the time required to evolve the profile's interests; it is used to reflect the updated profile and fetch the associated targeted ads (Eq. (18)). Recall that at this stage, as t → ∞, due to maximum profile convergence, the profile contains the dominating interests g_{k,l} that are considered private by the user, and hence exposes the user's privacy.
Our goal is to design a control algorithm that jointly optimises the user's privacy (both profiling and app usage activity at different times of the day/night, as detailed in the next section) and the cost of recommended apps (discussed in Section V).
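The per-slot evolution rule I_g^{t+1} = I_g^t + C_{t+1}(η_l(Ψ_l)) can be sketched as a simple accumulation loop; the dictionaries and values below are illustrative assumptions.

```python
# Sketch of profile evolution: per-slot interest changes are added to
# the profile until convergence (all names and values are illustrative).
def evolve(profile, changes):
    """profile: {interest: weightage}; changes: list of per-slot
    {interest: delta} dicts, i.e. C_t(eta_l(Psi_l)) for each slot t."""
    for c_t in changes:                       # I_g^{t+1} = I_g^t + C_{t+1}
        for interest, delta in c_t.items():
            profile[interest] = profile.get(interest, 0.0) + delta
    return profile

print(evolve({"sports": 0.25}, [{"sports": 0.25}, {"travel": 0.5}]))
```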

F. APPS USAGE PROFILE
Let U_t(K_a) represent the apps usage at time slot t (during day/night; note that, for simplicity, we do not consider the usage time of individual apps). The apps usage behavior of every user is time-varying: e.g. some users play game/gambling apps during lunch time or in bed right after waking up, or an employee might scroll through stocks in a broker's application during a lunch break, risking exposure to her employer. We note that a user's apps usage activity can be intercepted from the connected network [11] or through an app's interactions with the advertising systems [27], [34]. Let lim_{t→∞} (1/t) Σ_{τ=1}^{t} U_τ(K_a) denote the average app usage over time slots. The app usage time varies over the 24 hours for each user, which also exposes the user's privacy (irrespective of the category of the apps used) in terms of when apps are used during the day/night.

V. OPTIMAL PRIVACY PRESERVING PROFILING
Based on the above requirements for user profiling, we now study optimal privacy-preserving profile management that is cost-effective and preserves the user's privacy with respect to profiling interests and apps usage behavior, in addition to the user's web history/searches and interactions with ads. Figure 6 presents a detailed overview of the proposed framework. It introduces changes to the 'User Environment', which implements local user profiling for both the Context profile and the Interests profile and protects their privacy (Section V-B); similarly, it preserves the user's privacy for apps usage behavior during day/night time (Section V-C), implemented via the 'System App'. We implement these functionalities via a Proof of Concept (PoC) mobile app; a detailed discussion is given in Section VIII-A. The 'System App' also implements the framework that jointly optimises the profiling process and preserves user privacy in a cost-effective way, as detailed in Sections V-D, V-E, and V-F. We suggest that this framework can be integrated into the AdMob SDK, since the current ad ecosystem carries out user profiling and targeted ads via SDKs; this will require SDK modifications. FIGURE 6: The proposed advertising system with changes in user and advertising environments.

B. PROTECTING SENSITIVE PROFILING INTERESTS
To protect the private interest categories, i.e. those sensitive to users, we select various other apps based on a similarity metric [35] to reduce the dominating private interest categories present in a user profile. This metric is calculated over app keywords κ_{i,j} using tf-idf with cosine similarity [27]. The strategy is not metric-specific; other similarity metrics, e.g. the Jaccard index, can also be used. The newly selected candidate apps, which we call obfuscation apps (in this work we use 'recommended apps' and 'obfuscation apps' interchangeably), are selected (and run for a specific amount of time, as described in Eq. (23), (24), and (25)) from app categories Φ_j, j = 1, ..., φ, other than the private category Φ_p, i.e. the category users consider private and want to protect. The recommended set S_o comprises the apps with the highest similarity to the existing set of apps S_a. In addition, the user may protect any number of private interest categories Ψ_p, p = 1, ..., |Ω|, where Ω is the set of interest categories that are private to the user. Furthermore, we presume that the selected app(s) will always generate profiling interests other than the private profiling interest(s), i.e. l' ≠ l.
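A minimal sketch of the selection step follows, ranking candidate apps by tf-idf cosine similarity of their keyword lists to an installed app; the keyword lists are toy data, and the function names are our own, not the paper's implementation.

```python
# Sketch: rank candidate obfuscation apps by tf-idf cosine similarity
# of their keyword lists (toy keyword lists, assumed for illustration).
import math
from collections import Counter

def tfidf(doc, docs):
    """tf-idf weights for one app's keyword list against the corpus."""
    n, tf = len(docs), Counter(doc)
    df = Counter(w for d in docs for w in set(d))
    return {w: (c / len(doc)) * math.log(n / df[w]) for w, c in tf.items()}

def cosine(u, v):
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = [["news", "sports", "live"], ["news", "weather"], ["chess", "puzzle"]]
vecs = [tfidf(d, docs) for d in docs]
sims = [cosine(vecs[0], v) for v in vecs[1:]]
print(sims)  # the news/weather app is more similar to the installed app
```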
We assign weightage η_{l'}(Ψ_{l'}) to the newly generated profiling interests in I_g^t, in order to reflect their effect on (1) privacy (i.e. disruption in the user profile) and (2) targeted ads (disruption in receiving targeted ads based on the private profiling interests exposed to ad/analytics networks). A detailed discussion of profile obfuscation for achieving user privacy in app-based user profiling, using various obfuscation strategies, can be found in [35]. Cost is usually defined as the ratio of obfuscating to original data [44]; we further elaborate on cost in Section VII-A2. Recall that, beyond profile protection, the recommended apps are also used to protect the privacy of the user's apps usage behavior, e.g. over the 24-hour period.

C. CONTROL OBJECTIVE FOR APPS USAGE BEHAVIOR
As mentioned in Section IV-F, intuitively speaking, to achieve user privacy for apps usage at different time slots t of the day, the apps usage profile U_t(K_a) needs to be 'flattened' as t → ∞, as much as possible, by running the additional (recommended, as described in the section above) apps. Subsequently, the profile approaches the average apps usage Ū_τ(K_a) for a user's app profile K_a; alternatively, Ū_t(K_a) = C_t(η_l(Ψ_l)), ∀l ∈ [1, ψ]. In real time, at different time slots, the apps usage needs to be controlled with as little deviation from C_t(η_l(Ψ_l)) as possible. Note that this also applies to the interests in I_g^t generated via browsing history/searches and interactions with ads since, as mentioned earlier, Π_m → Ψ_l and Γ_n → Ψ_l; moreover, through both these activities, user devices interact with the ad/analytics networks, which track the user's activity along with the device's usage behavior.
Subsequently, the control objective is to minimise the variance of U_t(K_a). Privacy protection of the apps usage behavior, in addition to preserving the user's privacy with respect to profiling interests, can be achieved by running a few additional suggested apps, described in the next section, for our proposed scenario.
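The flattening idea can be sketched with a simple padding policy; padding every slot up to the peak is one illustrative policy that drives the variance to zero, not the paper's control algorithm, which balances flattening against cost.

```python
# Sketch: 'flatten' a per-slot app-usage profile by adding obfuscation
# usage in quieter slots (illustrative data; slots could be hours).
def flatten(usage):
    target = max(usage)                  # pad every slot up to the peak
    added = [target - u for u in usage]  # obfuscation usage per slot
    return [u + a for u, a in zip(usage, added)], added

flat, obf = flatten([0, 3, 1, 0])
print(flat, obf)  # flat profile has zero variance
```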

D. OBJECTIVE FUNCTION
In this paper, we jointly optimise the user's privacy due to profiling interests, the cost of running obfuscation apps, and the privacy of apps usage behavior; the objective function (Eq. (22)) combines the profiling-privacy, obfuscation-cost, and apps-usage-privacy terms. Here, the parameter β is selected by the user to achieve a trade-off between user privacy and targeted ads; note that the selection of this parameter affects the targeted ads as a result of the disruption in the user profile. We describe various scenarios for introducing recommended apps (along with their number). The first scenario (Eq. (23)) introduces obfuscation apps that cause minimum disruption in a user profile, hence achieving lower privacy and attracting more targeted ads; the last scenario (Eq. (25)) introduces the highest disruption in a user profile, i.e. it achieves higher privacy and attracts less relevant targeted ads. The middle scenario (Eq. (24)) introduces medium disruption in a user profile and achieves a balance between user privacy and targeted ads. An empirical example of the various scenarios is given in Figure 7. We envisage that scenario (24) also introduces a medium operating cost for the selected obfuscation apps, i.e. C^τ η_{l'}(Ψ_{l'}). FIGURE 7: Trade-off between privacy and targeted ads achieved via obfuscation scenarios of Low, Medium, and High profile (interest) disruption, respectively introducing 10%, 30%, and 80% disruption in a user profile.
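A toy scalarisation of the trade-off follows; this is an illustrative stand-in for Eq. (22), not the paper's exact objective, and all values (including β) are assumptions.

```python
# Toy scalarised objective: obfuscation cost plus beta-weighted
# privacy-leakage terms (lower is better; all values illustrative).
def objective(profiling_leakage, obf_cost, usage_leakage, beta):
    """beta trades privacy against cost/targeted ads."""
    return obf_cost + beta * (profiling_leakage + usage_leakage)

# Larger beta penalises leakage more, favouring heavier obfuscation.
print(objective(profiling_leakage=1.0, obf_cost=2.0, usage_leakage=1.0, beta=0.5))
```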

E. PROBLEM FORMULATION
The optimal privacy-preserving user profiling and targeted ads can be formulated as a dynamic optimisation problem (Eq. (26)), s.t. constraints (13), (14), (15), (17), and (24). An important challenge in solving this optimisation problem is knowing the user's temporal behavior as a combined 'Web & App' activity, i.e. the lack of knowledge of time-varying updates in a user profile. This change in temporal behavior is affected by the profiling derived from apps usage (Eq. (13)), browsing history/searches (Eq. (14)), and interactions with ads (Eq. (15)). The problem becomes even more challenging when irregular user activity is observed, e.g. high app/web usage during weekends. Hence, we develop an optimal control algorithm, using Lyapunov optimisation, to identify updates in user profiles from the communication requests between the mobile device and the advertising system (analytics) entities; see Section VI. Recall that ad/analytics SDKs issue these requests to track and profile users across their 'Web & App' activities.

F. PROBLEM RELAXATION
To solve the optimisation problem (26), we consider its relaxed version, using (16) to relax constraints (13), (14), and (15). The average expected change in a user profile I_g^t is given by Eq. (27). From Eq. (16), it is clear that the profile evolves over time, e.g. over time slots t and t + 1. Hence, taking expected values on both sides of Eq. (16) and equating to Eq. (27), we have: E{C_t(η_l(Ψ_l))} = C_{t-1}(η_l(Ψ_l)) + C_t(η_l(Ψ_l)) (28). Recall that C_t(η_l(Ψ_l)) starts from the initial weightages of the profiling interests as the profile evolves. Similarly, as mentioned earlier, these changes are bounded by finite minimum and maximum bounds. Dividing both sides of Eq. (28) by t and taking t → ∞ yields Eq. (30). Consequently, we have the following relaxed version of the objective function (31), s.t. constraints (16), (17), and (24). Now the main challenge in solving objective (31) is minimising the variance of U_t(K_a) in order to protect the user's app usage behavior; doing so directly would require knowing all future apps usage in order to suggest the (automatic) use of the recommended obfuscation apps. Similar to Eq. (30), we can also show that the time average of U_t(K_a) vanishes, i.e. that the decision variable is independent of U_t(K_a). Hence, the above optimisation problem can be solved without information about U_t(K_a) at different time slots t, as detailed below.

Proof
From Eq. (21), it can be shown that the resulting expression is independent of the choice of U_τ(K_a). As in Eq. (20), summing over all t, taking the expectation of both sides, and letting t → ∞, we obtain I_g^t + Ū_t(K_a); from Eq. (30), we conclude that the relaxed version of our objective function in Eq. (31) can be re-written as Eq. (36). Note that (36) is an optimised version of (24), i.e. the scenario introducing medium disruption in a user profile, which achieves a balance between user privacy and targeted ads. The other scenarios, of low (23) and high (25) profile disruption, have no effect on (36) except for their lower and higher costs and the corresponding trade-offs between privacy and targeted ads.

VI. OPTIMAL CONTROL ALGORITHM
We design a control algorithm that identifies communication requests between the mobile device and the analytics server within an advertising ecosystem, achieving an optimal solution to (26) by identifying time-varying updates in user profiles. Let R_l^t represent an advertising request reported at t via the ad/analytics SDK, i.e. either an ad request for display/ad click in apps/web or web searches/history. Recall that a request may or may not introduce l (the profiling interest category introduced during profile updating or evolution) in a user profile I_g^t, or alternatively a change in the user profile with an addition of C_t(η_l(Ψ_l)) during profile evolution, as described in Section IV-E. Subsequently, R_l^t is a shifted version of I_g^t, described by Eq. (38).

A. LYAPUNOV OPTIMISATION
For n requests R_l^t = (R_l^1, ..., R_l^n), the quadratic Lyapunov function for each t is given by Eq. (39), and the corresponding Lyapunov drift, i.e. drift-plus-penalty, is defined by Eq. (40). To stabilise the upper bound on Eq. (36), while keeping the profiling process effective (i.e. in the number of interests drawn), minimising the cost of recommended apps, and minimising the variance of U_t(K_a) to preserve the privacy of the apps usage behavior, the control algorithm is designed to minimise the following drift-plus-penalty in each time slot t. Here, p(t) is the penalty for minimising the upper bound on Eq. (36), i.e. minimise p(t) = E{C^t η_l(Ψ_l) + β I_g^t + η_l^t(Ψ_l) + β I_g^t}, and V > 0 is a non-negative parameter chosen to set the desired performance trade-off over the objective function. This approach does not require knowledge of all future events, i.e. it copes with the lack of knowledge of time-varying updates in a user profile and the apps usage behavior.
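A generic drift-plus-penalty step can be sketched as follows; the virtual queue, the candidate actions, and the quadratic penalty are illustrative stand-ins, not the paper's exact Eq. (39)-(41).

```python
# Sketch of one drift-plus-penalty step (Lyapunov optimisation):
# choose the action minimising drift + V * penalty for a virtual queue.
# All names and the penalty function are illustrative assumptions.
def drift_plus_penalty_step(queue, actions, arrival, V):
    """queue: current backlog; actions: candidate service rates;
    returns the action minimising Delta(L) + V * penalty(action)."""
    def score(a):
        nxt = max(queue + arrival - a, 0)
        drift = 0.5 * (nxt ** 2 - queue ** 2)   # Lyapunov L(q) = q^2 / 2
        penalty = a ** 2                         # assumed convex action cost
        return drift + V * penalty
    return min(actions, key=score)

print(drift_plus_penalty_step(queue=5, actions=[0, 1, 2, 3], arrival=1, V=1.0))
```

A larger V makes the penalty dominate, so the controller acts less aggressively; this mirrors the role of V as the performance trade-off parameter above.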

Lemma
The upper drift bound. For any control policy satisfying the constraints of Eq. (36), the drift-plus-penalty condition of Eq. (42) holds. Here, ε is similar to β and can be controlled to tune the average apps usage time, preserving the privacy of the user's app usage behavior and achieving a trade-off between user privacy and targeted ads; p̄(t) is the desired target for the time average of p(t), and B is a finite constant defined by Eq. (43). Taking expectations of both sides of the above drift-plus-penalty expression, dividing by V t, and rearranging terms proves the bound on the average penalty, which concludes the proof.

B. CONTROL ALGORITHM
The main objective of the control algorithm is to minimise the drift-plus-penalty bound subject to the constraints of (36) at each time slot t; a detailed description is given in Algorithm 1. The algorithm optimally selects the minimum and maximum bounds over the selection of obfuscation apps and the rate at which these apps need to run, according to the variation in R_l^t and by observing the current state of I_g^t, i.e. it solves the optimisation problem (47). The algorithm aims to achieve the minimised objective whenever changes occur in the profiling process at t, i.e. during the stable, profile development, and profile evolution states, in order to solve (47) as a mixed-integer non-linear programming problem. We consider the following cases.

Stable State
Recall that during this state no changes occur in the profiling process, i.e. C_t(η_l(Ψ_l)) = 0; hence, the optimal value of (47) evaluates to Eq. (48). Let p(R_l^t) track the advertising requests reported via the ad/analytics SDK during slot t; it takes its minimum, p_min(R_l^t), during the stable state, and hence η_l(Ψ_l) is selected as η_l(Ψ_l) = min{0, η_l^min(Ψ_l)}.

Profile Development/Evolution State
The profiling process speeds up during this state as a result of high interaction with the analytics servers. Let p_min(R_l^t), p_avg(R_l^t), and p_max(R_l^t) respectively represent the minimum, average, and maximum request rates; in the following, we present various scenarios for calculating the optimal value of (47). Algorithm 1: The control algorithm for joint optimisation of privacy and cost of in-app mobile user profiling and targeted ads.
Observe the current state of I_g^t and evaluate the problem using Eq. (47): min R_l^t + V(C^t η_l(Ψ_l) + β I_g^t + η_l^t(Ψ_l) + β I_g^t), subject to the constraint defined in Eq. (47).
11: For the stable state of a profile, update the profile using Eq. (48), i.e. V(C^t η_l(Ψ_l) + β η_l^t(Ψ_l)).
12: For the profile development and evolution states, update the profile using Eq. (49), i.e. p(R_l^t) + V(C^t η_l(Ψ_l) + β I_g^t + η_l^t(Ψ_l) + β I_g^t).
13: For the scenarios of minimum (p_min(R_l^t)), average (p_avg(R_l^t)), and maximum (p_max(R_l^t)) values of p(R_l^t), update the value of R_l^t using Eq. (38).
Given these values, the optimal objective can be calculated: the lower of (48) and (49), with the corresponding control values, is used to evaluate the optimal value of our objective.

VII. PERFORMANCE MEASURES
We now analyse the feasibility and performance of the proposed model using various evaluation metrics; in addition, we discuss the PoC implementation of our proposed model. The applicability of the proposed system is discussed in Section IX-C.

A. EVALUATION METRIC
We define utility and further elaborate on the cost of recommended apps to provide insights into the usability of the recommended apps. The authors in [44] describe utility as the success rate of removal of private query tags, i.e. the magnitude of suppression of the original user preferences.

1) Utility
Let D(η_p(Ψ_p)) be the dominating private interest category. We define utility based on two components. First, for the effectiveness of privacy protection, we use as a metric the level of reduction R_p of η_p(Ψ_p) for a selected private category Ψ_p in an Interest profile, achieved via the recommended obfuscating apps, with R_p = D(η_p(Ψ_p)) / D(η_{l'}(Ψ_{l'})). Here, D(η_{l'}(Ψ_{l'})) is the new dominating ratio resulting from using the apps in S_o (S_o is the new set of apps other than S_a; the selection of these apps is detailed in Section V-B).
Second, we introduce the usability U_s of the selected obfuscating apps, i.e. the probability that a user would actually utilise these apps rather than merely install and run them for privacy protection. We define the U_s of an app a_o ∈ S_o, with regard to a user with a specific Context profile, as the ratio of the similarity between this app and any of the apps in S_a to the maximum similarity between any other app from A not present in S_a and apps from the same set. Combining the two, the total utility can then be calculated as: U_T = R_p + U_s.
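The total utility U_T = R_p + U_s can be computed from the two ratios above; the numeric values below are toy assumptions, not measured results.

```python
# Sketch: total utility U_T = R_p + U_s (toy values; the dominating
# ratios and similarity scores are assumed for illustration).
def total_utility(dom_private, dom_new, sim_selected, sim_best_other):
    r_p = dom_private / dom_new          # reduction of the private category
    u_s = sim_selected / sim_best_other  # usability of the obfuscating app
    return r_p + u_s

print(total_utility(dom_private=0.2, dom_new=0.4, sim_selected=0.6, sim_best_other=0.8))
```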

2) Cost and resource overhead
Cost and resource overhead can be considered equivalent terms in the context of introducing new activities, which consume usage time and other resources, e.g. battery; however, for the sake of clarity, we use two separate terms. As mentioned earlier, cost C is a metric relating to the reduction of the overall usage time available (due to the use of obfuscating apps) for the original (non-obfuscating) apps.
In the basic scenario, with an even introduction of recommended apps, the usage of all apps is uniformly distributed within a time period corresponding to Ū_t(K_a); this is equivalent to the ratio of the number of obfuscating apps in S_o (that need to be installed and used to protect privacy) to the size of the original app set S_a. Cost is therefore defined as the ratio C = |S_o| / |S_a|. We consider the resource overhead R to be the overall resource usage from running the recommended apps. Hence, there is a corresponding overhead R_{i,j}^t for each app a_{i,j} at time slot t, comprising broadly the communication R_c^t(a_{i,j}), processing R_p^t(a_{i,j}), and battery consumption R_b^t(a_{i,j}) overheads.
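The cost ratio C = |S_o| / |S_a| is a one-liner; the app sets below are illustrative.

```python
# Sketch: cost as the ratio of obfuscating to original apps,
# C = |S_o| / |S_a| (the sets are illustrative examples).
def cost(s_o, s_a):
    return len(s_o) / len(s_a)

print(cost({"chess", "recipes"}, {"news", "maps", "broker", "mail"}))  # 0.5
```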
In the following, we present further detail on resource usage and then experimentally evaluate the various components of the overall resource overhead R (see Section VIII-E).

B. EVALUATING RESOURCE USE
We rely on various utilities of the Android SDK to automate the measurements of processing and battery consumption. For example, we execute adb shell top -m 10 within Process p = Runtime.getRuntime().exec("command") to determine the CPU usage of each running app. Similarly, to read the battery's current level, we use the adb shell dumpsys battery | grep level command, initiated via startActivity() with the Intent utility of the Android SDK. To measure battery consumption, we first charge the battery to 100% and then run the app for one hour while connected to a WiFi network. We envisage that recommended apps would be run over WiFi to reduce communication costs (specifically for users with limited mobile data packages), although, in a real-life scenario, users are likely to utilise apps (equally) on a mobile network, which would result in different magnitudes of resource overhead.
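The battery reading described above can be automated by parsing the `dumpsys battery` output; the sample text below is an abbreviated, assumed output, and a real run would invoke adb via subprocess instead of using a string.

```python
# Sketch: parse the battery level from `adb shell dumpsys battery`
# output (sample text assumed; real use would call adb via subprocess).
def parse_battery_level(dumpsys_output):
    for line in dumpsys_output.splitlines():
        line = line.strip()
        if line.startswith("level:"):
            return int(line.split(":", 1)[1])
    return None

sample = "Current Battery Service state:\n  AC powered: false\n  level: 87\n  scale: 100"
print(parse_battery_level(sample))  # 87
```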
Similarly, we utilise the traffic captured during our experiments to evaluate communication overhead; described in Section VIII-B.
In addition to the above, we evaluate another resource usage overhead, i.e. storage space consumption. This can be further classified into installation storage space; cache storage, i.e. temporarily stored data such as cookies stored on the phone; and internal data size, i.e. storage used for apps' files, accounts, etc. These are respectively obtained from the codeSize, cacheSize, and dataSize fields of the PackageStats class of the Android SDK. Note that we automated all these measurements through an app, along with all the experimentation discussed in the next section, using the Android Debug Bridge of the Android SDK, which enables communication between a PC and connected Android devices.

VIII. PERFORMANCE EVALUATION
We now discuss the details of the various components of the 'System App' implementation and the experimental evaluation, and further provide insights on resource overheads.

A. SYSTEM APP: THE OBFUSCATING SYSTEM
We have implemented a PoC 'System App' of our proposed framework. Its components are presented on the left side of Figure 6, i.e. the 'User Environment', which introduces changes on the user side with a range of functionalities: it implements local user profiling for both the Context profile and the Interests profile (as detailed in Section V-B), protects users' (private) sensitive attributes, and preserves user privacy for their apps usage behavior (Section V-C). In addition, it implements our proposed online control algorithm for jointly optimising user privacy and cost, presented in Section VI. To enable this functionality, a user needs to install and run the 'System App'; this approach is similar to existing app recommender systems, e.g. AppBrain. The 'System App' acquires information about the set of currently installed apps on a device and interacts with the user regarding the selection of private attributes to be protected. Furthermore, it evaluates the list of candidate obfuscating apps, presents it to the user, and automates the process of installing and running these apps.
The 'System App' sends the installed-app information to the 'System Engine', which calculates the obfuscating apps, and automates their installation and running, as shown in Figure 6. We have used various utilities of the Android SDK in our implementation; e.g., PackageManager is used to retrieve the list of installed apps, and the metadata of the installed apps, such as app name, permissions, etc., is obtained by calling getInstalledApplications(PackageManager.GET_META_DATA). The 'System Engine' evaluates obfuscating apps according to the criteria discussed in Section V-B by examining the 'Local app repository'; we suggest that this repository be updated by the advertising system so that the 'System App' can calculate obfuscating apps from various app categories.
The 'System Engine' module forwards the list of obfuscating apps to the 'System App'. Each app is displayed to the user as an accessible hyperlink to the Google Play store (or Apple's App Store), done by invoking startActivity(new Intent(Intent.ACTION_VIEW, Uri.parse("market://details?id=" + appPackageName))). The package name (by which the app market recognises the app to be installed) is specified via appPackageName in the Intent. These apps are then automated and run, using the startActivity utility of the Android SDK, for the amount of time required to generate new profile interests in the ad system.

B. EXPERIMENTAL SETUP
In this work, we mainly focus on Google AdMob, since it is the leading marketplace in mobile user profiling and has captured the online digital advertising market; however, we note that the proposed methods can readily be applied to other ad/analytics networks. We carry out various experiments for preserving profiling privacy (i.e. I_g^t + η_l^t(Ψ_l)), apps usage (I_g^t − U_t(K_a)), and evaluating various resource overheads R. To this end, we select apps from 27 random (sample) categories; we note that the Google Play store organises apps into 37 categories (counting 'Games' as a single category), e.g. 'Entertainment', 'Lifestyle', etc. For these experiments, we select the top 100 free apps from the 27 randomly chosen categories and further narrow them down to the 10 highest-ranked apps from each category. As mentioned earlier, the Google user profile is partly based on mobile app usage and the relevance of received ads; hence, we had to ensure that the tested apps receive ads.
We set up a second (mapping) experiment: we select one mobile app from the top list of each category and run it for a period of up to 96 hours; this process was automated as described in Section VIII-A. The purpose of these experiments was to evaluate the mapping of a specific Context profile to an Interest profile, i.e. K_a → I_g, as discussed in Section IV-B; this helps determine the contribution of an individual app to an Interest profile. We note that the findings from these experiments help in selecting recommended apps for various disturbances in user profiles, i.e. achieving different trade-offs between privacy and targeted ads, as explained in Eq. (23), (24), and (25), e.g. 0 < η_l(Ψ_l) ≤ η_l^min(Ψ_l) for lower privacy and more targeted ads. These experiments took around 3 months to complete for the 270 highest-ranked apps from the 27 random app categories.
The ads traffic, including the control traffic exchanged for tracking/profiling purposes, was collected using tcpdump, cleansed, and saved to a local database throughout the experiments. We reset the profile before starting each experiment to ensure that the Interest profile results only from the currently installed and actively used apps. In addition, we set up a phone with the same configuration but with 'Opt-out of Ads Personalisation' enabled in the Google Settings system app; this phone serves as a base reference for both the newly generated user profiles and the received targeted ads. These experiments ran 24/7 for 5 months over all the selected categories; due to practical limitations, we used only 10 smartphones in parallel.
We also used the traffic collected from these experiments to calculate resource usage, as detailed in Section VII-B. Figure 8 compares the apps usage profile for lower and higher profile disruption, introduced by lower (Figure 8 (a)) and higher (Figure 8 (b)) activity of recommended apps; the original user profile activity is shown with rectangular boxes, as already discussed in Figure 4. We first divide time into 5-minute bins and then record the number of ad requests in each bin (the number of ad requests differs per app, as detailed in Section VIII-E); each dot in Figures 8 (a) and (b) represents a bin over the 24-hour period (x-axis) and its corresponding ad-request frequency (y-axis).
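The binning step above can be sketched as follows; the timestamps are toy values (seconds since midnight), not traffic from our traces.

```python
# Sketch: bin ad-request timestamps (seconds since midnight) into
# 5-minute bins and count requests per bin (toy timestamps).
def bin_requests(timestamps, bin_seconds=300):
    counts = {}
    for ts in timestamps:
        b = ts // bin_seconds
        counts[b] = counts.get(b, 0) + 1
    return counts

print(bin_requests([10, 250, 301, 3600]))  # {0: 2, 1: 1, 12: 1}
```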

C. TRADE-OFF BETWEEN APPS USAGE PRIVACY AND COST OF PROFILE DISRUPTION
The recommended apps were run during various (day/night) times when there was no original (K_a) app activity, e.g. during 1am-6am, as indicated in both figures. We note that, during 1am-6am with 'lower app usage and low profile disruption', the recommended apps issue 27 (binned) ad requests fetching a total of 93 ads, whereas the corresponding figures are 93 and 1323 for 'higher app usage and high profile disruption'. In addition, the average number of ad requests and corresponding ads fetched were 19.6 and 60 for 'lower app usage and low profile disruption', versus 77.6 and 809.6 for 'higher app usage and high profile disruption'. Overall, there were 340 and 5068 ad requests for the two profile-disruption options, respectively; note that this also affects various overheads, in particular the communication overhead, discussed later in Section VIII-E. Figures 9 (a) and (b) show the frequency distribution of ads for lower and higher apps usage; the percentage of ads presented in different time bins is also shown on top of each bar. Per time bin, the presented ads ranged from 1 to 16 ad requests; in particular, the request frequency ranges from 1 to 9 for 'lower apps usage', which also attracted a lower number of ads, i.e. only 85 ads to fully preserve apps usage privacy, as shown in Figure 9 (a). Similarly, for 'higher apps usage', the ad requests range from 1 to 16 per time bin, with an average of 32 ads across all request frequencies and a total of 480 ads to fully preserve the user's privacy. Recall that such apps usage is selected mainly for two purposes: to protect apps usage privacy and to affect the targeted ads.

E. PRIVACY PROTECTION VS. RESOURCE USE
Previous works [35], [45] report, and our experimental results also confirm, that ads (and their related tracking traffic) are the major contributor to the communication overhead R_t^c(a_{i,j}). Hence, we approximate the communication cost by the traffic generated by ads and the related user actions. From the bandwidth viewpoint, the ads traffic is characterised by various components: the ad refresh rate (technically, the inter-arrival time of two consecutive ads); correspondence with various ad/analytics servers within an advertising ecosystem; contacts with CDNs for downloading various ad components, e.g. images; the number of objects associated with an ad, along with their sizes; and communication with various servers during interactions with an ad. Table 2 shows various ad-related objects and control messages, along with their sizes, determined from the collected traffic traces.
From the collected traces, we observe that an ad is 16±4KB in size and (on average) contains 8-10 objects (e.g. JavaScript files, images, etc.), with an average of 30-35 request/response messages. In addition, ad refresh rates vary between 20-60 seconds, with distinct values of 20, 30, 45 and 60 seconds, corresponding respectively to 36%, 47%, 15% and 2% of all the tested apps. Since ad sizes do not vary widely, the ad traffic is largely determined by the refresh rate, which is deterministic for every app and is configured by the app developer at the time of registering the app on the Google Play store.
We note that the supported refresh-rate values in Google AdMob are 12-120 seconds.
The communication overhead for introducing lower disturbance in a user profile can be further minimised by selecting apps with the maximum ad refresh rate (i.e. the least frequent ad fetches); note that this information is already available to any ad network, e.g. Google AdMob. Figure 10 shows the distribution of bandwidth used by apps during our experiments, per the experimental setup described in Section VIII-B. The proportion of apps that consumed the respective bandwidth is shown on top of each bar; e.g. 26.30% of the apps (corresponding to 71 apps) consume a high bandwidth of 4.0-4.5MB. In addition, we note that apps that frequently fetch ads, i.e. apps with ad refresh rates of 20 and 30 seconds, utilise an average bandwidth of 3.0-5.5MB; these apps represent around 70% of all the experimented apps. Such apps can be selected to introduce higher disruption in user profiles, attract fewer targeted ads, and achieve high apps usage privacy. The remaining 30% of the apps (i.e. apps with ad refresh rates of 45 and 60 seconds) utilise between 0.5-2.5MB of communication bandwidth. Subsequently, we evaluate the processing overhead introduced by the experimented apps. The measured CPU usage varies, although not widely, across apps: CPU-intensive apps, such as those from the 'Games' category, use between 25% and 30% of the CPU power; less-interactive apps, such as Notes, use between 15% and 20%.
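The relationship between refresh rate and bandwidth can be approximated with a back-of-envelope calculation using the averages reported above (16KB per ad, one ad per refresh interval). This is a hedged sketch under those simplifying assumptions, not a measurement tool; real traffic also includes tracking and CDN traffic, which this ignores.

```python
AVG_AD_KB = 16.0  # average ad size observed in the traces (16±4 KB)

def session_ad_traffic_kb(session_minutes, refresh_rate_s, ad_kb=AVG_AD_KB):
    """Approximate per-session ad traffic: one ad is fetched per
    refresh interval, each transferring roughly ad_kb kilobytes."""
    ads_fetched = (session_minutes * 60) // refresh_rate_s
    return ads_fetched * ad_kb

# A 60-minute session with a 20 s refresh rate fetches 180 ads (~2.8 MB);
# the same session with a 60 s refresh rate fetches 60 ads (~0.9 MB).
fast_app = session_ad_traffic_kb(60, 20)
slow_app = session_ad_traffic_kb(60, 60)
```

This crude estimate is consistent with the observation that apps with 20-30 second refresh rates dominate bandwidth consumption, while 45-60 second apps stay in the lower bandwidth bins.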
Similarly, the battery consumption measured in our experiments shows relatively low variation across apps, with between 30% and 40% of the total battery (i.e. 100%) being used by each app during the measurement period.
In addition to the above basic overheads, we evaluated the storage space consumed by the recommended apps, mainly determining the installation, data, and cache storage spaces. These storage spaces vary according to app requirements; e.g. a 'language translation' app might do offline text translation and hence need to save library files in its data storage quota, requiring more space than its installation storage. In contrast, Facebook would consume more data storage space to store user account data, search history, group settings, user timeline data, etc., while Google Maps would take more cache storage space to save the user's searched-places history. Table 3 presents a few representative apps for combinations of these storage space requirements; further details follow. Figures 11 (a) through (c) show the distribution of storage space across the experimented apps; the proportion of apps requiring the respective storage space is also shown on top of each bar. For installation storage, it can be observed that nearly half (54%) of the experimented apps require lower storage space, i.e. 0.5MB to 10MB, while only 1.48% of these apps consume the relatively higher storage space of 50-60MB. Similar observations hold for the data and cache overheads, as shown in Figures 11 (b) and (c). In conclusion, we observe that the vast majority of these apps use a small amount of storage space: 80%, 97%, and 98% of all apps fall in the lowest storage bins, i.e. within the ranges of 0.5-20MB, 0.01-20MB, and 0.01-10MB, respectively, for the installation, data, and cache storage overheads.

IX. DISCUSSION
We discuss the applicability of our framework in the current advertising ecosystem, and compare the proposed framework with various privacy protection approaches under the presented threat model.

A. PROTECTING SENSITIVE PROFILING INTERESTS VIA DIFFERENTIAL PRIVACY
The concept of differential privacy was introduced in [46] as a mathematical definition of the privacy loss associated with any data released from a database; a deeper treatment of differential privacy and its algorithms can be found in [47]. Let D_1 = g_{k,l} ∈ S, i.e. the set of interests in a user profile, and D_2 = g_{k,l} ∈ S_g, where Ψ'_l = Ψ_l and g_{k,l} is a profiling interest other than the primary set of interests defined by the advertising company; Ψ_l is a selected private interest category that the user wants to protect. A randomised function K then gives ε-differential privacy for these two data sets if, for all sets of outputs O ⊆ Range(K):

Pr[K(D_1) ∈ O] ≤ exp(ε) · Pr[K(D_2) ∈ O]

A C++ implementation of a differential privacy library can be found at https://github.com/google/differential-privacy.

B. COMPARISON WITH OTHER PRIVACY PROTECTION MECHANISMS
We also examine other privacy protection mechanisms that could be utilised, e.g. private information retrieval, anonymisation, randomisation, and Blockchain-based solutions; however, we note that these solutions do not fully protect the user's privacy under the presented threat model. In the following, we provide a comparative analysis of these mechanisms. Table 4 provides a hypothetical comparison of various privacy protection mechanisms, using different parameters evaluated in our proposed framework. It can be observed that only the proposed mechanism of introducing recommended obfuscation apps protects the user's privacy for 'app usage behavior', since the user has to run these apps at different periods of the day/night in order to protect usage behavior. Similarly, an important parameter is the 'trade-off between privacy and targeted ads', which can only be achieved using the proposed mechanism and the randomisation and obfuscation approaches. A further parameter is protecting 'user privacy in terms of serving targeted ads' (an indirect privacy attack exposing user privacy), which can be adjusted according to the user's needs, i.e. 'low-relevant vs. high-relevant interest-based ads'.
We plan to carry out a comprehensive study of these parameters across the various privacy protection mechanisms in the future, in order to validate or invalidate our hypotheses.
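The ε-differential-privacy guarantee from Section IX-A can be illustrated with randomized response, the simplest ε-DP mechanism: report whether a sensitive interest category is in the profile, but tell the truth only with a calibrated probability. This is an illustrative sketch, not the paper's mechanism; the function names and the choice of ε are ours.

```python
import math
import random

def randomized_response(has_interest, epsilon):
    """Report whether the sensitive category is in the profile, telling
    the truth with probability e^eps / (1 + e^eps) and lying otherwise.
    For the two neighbouring profiles D1 (interest present) and D2
    (interest absent), either output o then satisfies
    Pr[K(D1) = o] <= e^eps * Pr[K(D2) = o], i.e. eps-DP."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return has_interest if random.random() < p_truth else (not has_interest)

# Check the bound analytically for eps = ln(3): truth probability is 0.75,
# and the worst-case likelihood ratio p/(1-p) equals exactly e^eps = 3.
eps = math.log(3.0)
p_truth = math.exp(eps) / (1.0 + math.exp(eps))
ratio = p_truth / (1.0 - p_truth)
```

Smaller ε means a lower truth probability and stronger protection of the private interest category, at the cost of a noisier profile delivered to the advertising system.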

C. APPLICABILITY OF PROPOSED FRAMEWORK IN ADVERTISING SYSTEM 1) Overall System Functionality
The motivation for protecting user privacy depends very much on the way consumers use mobile apps and access the internet; users are ever more concerned about preserving their privacy due to an enormous increase in privacy awareness, e.g. following the exposure of mass surveillance activities and unauthorised leaks of personal data. Hence, users are ever more interested in using personal (bespoke) privacy tools. The proposed framework is conceived in line with current app recommender and personalisation systems: it not only suggests usable recommended apps but also enables an optimised and cost-effective privacy protection mechanism. As shown in Section VIII-C, the framework optimises cost by selecting various numbers of apps for protecting apps usage privacy, protects the user's private interest profile, and achieves an optimal trade-off between privacy protection and targeted ads. Furthermore, the framework ensures that the recommended apps have good overall usability, based on the similarity metric discussed in Section VII-A.
http://www.theguardian.com/world/the-nsa-files
An important user concern is resource use. We note from our mapping experiments (detailed in Section VIII-B) that such mappings can significantly reduce resource use by selecting appropriate apps that (subject to the availability of such mapping information) effectively preserve user privacy with the least overhead. In these experiments, the average communication cost is around 3MB for lower profile disruption, compared to 17MB for higher profile disruption, to achieve apps usage and profiling privacy. This would motivate users, in particular those on a fixed mobile data plan, to adopt such a strategy. As discussed earlier, AdMob profiling is based on 'Web & App' activity; hence, such mapping information could also be added for the user's web searches/histories. Although profiling is now done via both 'Web & App' activity, this app-based strategy is still applicable to protecting user privacy for interest profiling. We envisage that such information could be derived by approximating the user profile from various related information, such as search/history keywords, similar to the interest mapping discussed in Section IV-B for K_a.

2) Server Side Modifications
Integrating the proposed framework within the existing advertising ecosystem would be fairly straightforward and would only require upgrading the server-side tracking and analytics mechanisms; i.e. we suggest the advertising system transfer such functionalities to the client side, where the client remains honest. These changes mainly concern the Aggregation and Analytics servers, which would respectively receive the constructed user profile (along with anonymous apps usage statistics, excluding the user's Advertising ID) and other statistics, e.g. ad impressions/clicks, required by both the advertising system and app developers.

3) Client Side Modifications
A major change on the client side is the implementation of the 'System App', i.e. user profiling, optimising the user's privacy for targeted ads and their interactions with ads, and apps usage privacy, including the 'Obfuscation engine' for selecting recommended apps. Currently, a mobile device integrates with the advertising ecosystem via the SDK; hence, this will mainly require modifications to the client's AdMob SDK.

D. RESEARCH LIMITATIONS
The proposed system protects users' privacy against legitimate user profiling, traffic monitoring/analysis, and network surveillance, both from the host advertising systems and from third-party trackers/analytics. However, we did not address location-based privacy for ad targeting in detail, although it can also be handled by our privacy protection approach, since location is added as one of the interests in the user profile as part of the demographics, as detailed in Section III-A. For location-specific ads, low-resolution GPS coordinates can be included in the user profile to accommodate advertisers and businesses wishing to advertise to passing trade. In addition, location can be protected using other mechanisms, such as 'Tor', used in conjunction with our system to prevent such threats.

X. CONCLUSION
Online mobile targeted advertising is growing in popularity and, enabled by powerful user-tracking tools, has at the same time raised serious privacy concerns among individuals. This paper presents an optimal, privacy-preserving, and cost-effective framework for preserving user privacy against user profiling, ads-based inferencing, ad targeting, and (in general) the exposure of user behavior on mobile devices. We present a dynamic optimisation framework by first examining the underlying advertising ecosystem and then providing a privacy-preserving framework for the temporal changes that occur in the user environment in an ad ecosystem. An online control algorithm, based on Lyapunov optimisation, detects temporal changes in user profiling and achieves an optimal solution without requiring any knowledge of future mobile app use or of temporal changes in a user profile. We carry out extensive experimentation using mobile devices with various profiles and examine the profiling process, privacy leakage, privacy protection, and resource usage. We develop a POC 'System App' that implements the critical components of the proposed framework and further discuss its applicability in an online advertising ecosystem.