Commit 7f24f1e

docs: Update readme about jar name. (#73)
* update
* update
1 parent: b0eb9e0 · commit: 7f24f1e

3 files changed (+22, -15 lines)


.readme-partials.yaml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ custom_content: |
 <!--
 | Scala version | Connector Artifact |
 | --- | --- |
-| Scala 2.11 | `com.google.cloud.pubsublite.spark:pubsublite-spark-sql-streaming-with-dependencies_2.11:0.1.0` |
+| Scala 2.11 | `com.google.cloud.pubsublite.spark:pubsublite-spark-sql-streaming:0.1.0:with-dependencies` |
 -->

 <!--- TODO(jiangmichael): Add exmaple code and brief description here -->
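Worth noting: the corrected coordinates move `with-dependencies` out of the artifact ID and into a Maven classifier, i.e. the `groupId:artifactId:version:classifier` form. A minimal sketch of fetching the shaded jar through those coordinates, assuming the maven-dependency-plugin `copy` goal; the plugin version and output directory here are illustrative choices, not part of this commit:

```sh
# Sketch: resolve the classified "with-dependencies" jar from Maven Central.
# The g:a:v:type:classifier form tells dependency:copy which artifact to fetch.
mvn org.apache.maven.plugins:maven-dependency-plugin:3.1.2:copy \
  -Dartifact=com.google.cloud.pubsublite.spark:pubsublite-spark-sql-streaming:0.1.0:jar:with-dependencies \
  -DoutputDirectory=.
```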

samples/README.md

Lines changed: 21 additions & 13 deletions
@@ -19,15 +19,24 @@ PARTITIONS=1 # or your number of partitions to create
 CLUSTER_NAME=waprin-spark7 # or your Dataproc cluster name to create
 BUCKET=gs://your-gcs-bucket
 SUBSCRIPTION_PATH=projects/$PROJECT_NUMBER/locations/$REGION-$ZONE_ID/subscriptions/$SUBSCRIPTION_ID
-PUBSUBLITE_SPARK_SQL_STREAMING_JAR_LOCATION= # downloaded pubsublite-spark-sql-streaming-with-dependencies jar location
+CONNECTOR_VERSION= # latest pubsublite-spark-sql-streaming release version
+PUBSUBLITE_SPARK_SQL_STREAMING_JAR_LOCATION= # downloaded pubsublite-spark-sql-streaming-$CONNECTOR_VERSION-with-dependencies jar location
 ```

 ## Running word count sample

 To run the word count sample in Dataproc cluster, follow the steps:

 1. `cd samples/`
-2. Create the topic and subscription, and publish word count messages to the topic.
+2. Set the current sample version.
+```sh
+SAMPLE_VERSION=$(mvn -q \
+  -Dexec.executable=echo \
+  -Dexec.args='${project.version}' \
+  --non-recursive \
+  exec:exec)
+```
+3. Create the topic and subscription, and publish word count messages to the topic.
 ```sh
 PROJECT_NUMBER=$PROJECT_NUMBER \
 REGION=$REGION \
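A side note on the `SAMPLE_VERSION` trick added above: Maven interpolates `${project.version}` in `-Dexec.args` before launching `echo`, `-q` suppresses build logging, and `--non-recursive` stops it printing one version per module. Assuming maven-help-plugin 3.1.0+ is available, an equivalent sketch is shorter:

```sh
# Alternative sketch: evaluate the expression directly via the help plugin.
# -DforceStdout prints just the value; -q keeps Maven's own output quiet.
SAMPLE_VERSION=$(mvn -q help:evaluate -Dexpression=project.version -DforceStdout)
echo "$SAMPLE_VERSION"
```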
@@ -37,32 +46,31 @@ To run the word count sample in Dataproc cluster, follow the steps:
 PARTITIONS=$PARTITIONS \
 mvn compile exec:java -Dexec.mainClass=pubsublite.spark.PublishWords
 ```
-3. Create a Dataproc cluster
+4. Create a Dataproc cluster
 ```sh
 gcloud dataproc clusters create $CLUSTER_NAME --region=$REGION --zone=$REGION-$ZONE_ID --image-version=1.5-debian10 --scopes=cloud-platform
 ```
-4. Package sample jar
+5. Package sample jar
 ```sh
 mvn clean package -Dmaven.test.skip=true
 ```
-<!-- TODO: set up bots to update jar version, also provide link to maven central -->
-5. Download `pubsublite-spark-sql-streaming-with-dependencies-0.1.0.jar` from Maven Central and set `PUBSUBLITE_SPARK_SQL_STREAMING_JAR_LOCATION` environment variable.
-<!-- TODO: set up bots to update jar version -->
-6. Create GCS bucket and upload both `pubsublite-spark-sql-streaming-with-dependencies-0.1.0.jar` and the sample jar onto GCS
+<!-- TODO: provide link to maven central -->
+6. Download `pubsublite-spark-sql-streaming-$CONNECTOR_VERSION-with-dependencies.jar` from Maven Central and set `PUBSUBLITE_SPARK_SQL_STREAMING_JAR_LOCATION` environment variable.
+7. Create GCS bucket and upload both `pubsublite-spark-sql-streaming-$CONNECTOR_VERSION-with-dependencies.jar` and the sample jar onto GCS
 ```sh
 gsutil mb $BUCKET
-gsutil cp snapshot/target/pubsublite-spark-snapshot-1.0.21.jar $BUCKET
+gsutil cp snapshot/target/pubsublite-spark-snapshot-$SAMPLE_VERSION.jar $BUCKET
 gsutil cp $PUBSUBLITE_SPARK_SQL_STREAMING_JAR_LOCATION $BUCKET
 ```
-7. Set Dataproc region
+8. Set Dataproc region
 ```sh
 gcloud config set dataproc/region $REGION
 ```
 <!-- TODO: set up bots to update jar version -->
-8. Run the sample in Dataproc
+9. Run the sample in Dataproc
 ```sh
 gcloud dataproc jobs submit spark --cluster=$CLUSTER_NAME \
---jars=$BUCKET/pubsublite-spark-snapshot-1.0.21.jar,$BUCKET/pubsublite-spark-sql-streaming-with-dependencies-0.1.0.jar \
+--jars=$BUCKET/pubsublite-spark-snapshot-$SAMPLE_VERSION.jar,$BUCKET/pubsublite-spark-sql-streaming-$CONNECTOR_VERSION-with-dependencies.jar \
 --class=pubsublite.spark.WordCount -- $SUBSCRIPTION_PATH
 ```
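The hunk above still leaves `CONNECTOR_VERSION` for the reader to fill in by hand. Something like the following sketch could look it up automatically; it assumes the public search.maven.org Solr API and a `latestVersion` field in its JSON response, and the `sed` parsing is deliberately naive (use real JSON tooling such as jq for anything durable):

```sh
# Hypothetical lookup of the newest connector release on Maven Central.
CONNECTOR_VERSION=$(curl -s \
  'https://search.maven.org/solrsearch/select?q=g:com.google.cloud.pubsublite.spark+AND+a:pubsublite-spark-sql-streaming&rows=1&wt=json' \
  | sed -n 's/.*"latestVersion":"\([^"]*\)".*/\1/p')
echo "$CONNECTOR_VERSION"
```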

@@ -74,7 +82,7 @@ To run the word count sample in Dataproc cluster, follow the steps:
 ```
 2. Delete GCS bucket.
 ```sh
-gsutil -m rm -rf $BUCKET_NAME
+gsutil -m rm -rf $BUCKET
 ```
 3. Delete Dataproc cluster.
 ```sh
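This cleanup hunk fixes a real hazard: `$BUCKET_NAME` was never defined in the README, so the old command could expand to `gsutil -m rm -rf` with no target at all. A defensive variant, purely as a sketch, fails fast instead of running with an empty argument:

```sh
# Defensive sketch: ${VAR:?msg} aborts with an error if the variable is
# unset or empty, so a typo can't turn into a stray recursive delete.
gsutil -m rm -rf "${BUCKET:?BUCKET is not set}"
```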

samples/pom.xml

Lines changed: 0 additions & 1 deletion
@@ -3,7 +3,6 @@
   <modelVersion>4.0.0</modelVersion>
   <groupId>com.google.cloud</groupId>
   <artifactId>google-cloud-pubsublite-spark-samples</artifactId>
-  <version>0.0.1-SNAPSHOT</version><!-- This artifact should not be released -->
   <packaging>pom</packaging>
   <name>Google Pub/Sub Lite Spark Connector Samples Parent</name>
   <url>https://github.com/googleapis/java-pubsublite-spark</url>
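Removing the hard-coded `<version>` presumably lets the samples module inherit its version from a `<parent>` declaration (not shown in this hunk), which is what makes the README's `${project.version}` lookup track the current sample version automatically. A quick sanity check, as a sketch:

```sh
# Sketch: confirm the samples module still resolves a version after the
# explicit <version> tag was removed (it should now come from the parent pom).
cd samples/
mvn -q help:evaluate -Dexpression=project.version -DforceStdout
```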
